1

Topic: Latest version (from nuget) does not always work on Linux/dotnet

I have a simple program that talks to an HID device.
It runs fine on Windows/.NET, Linux/Mono, and Windows/dotnet, but not on Linux/dotnet.

Basically it crashes when enumerating the HID devices:

AvailableDevices = DeviceList.Local.GetHidDevices(VID, PID).Select(d => new HidDeviceUI(d)).ToList();

     
Sometimes this works and returns the list of devices, but most of the time it results in the following error:

double free or corruption (out)

(this happens as both a normal user and as root user)

Please note that all OS/Framework combinations have been tested on two distinct computers.

Dotnet version on Linux: 2.2.103
Ubuntu 18.04.1

2

Re: Latest version (from nuget) does not always work on Linux/dotnet

Did the earlier version 2.0.5 have this same problem for you? Also, is it possible to get a stack trace for the error?

3

Re: Latest version (from nuget) does not always work on Linux/dotnet

I will give that version a try and get back to you

4

Re: Latest version (from nuget) does not always work on Linux/dotnet

2.0.5 gives the same result.
I am unable to catch it as an exception as the error occurs at the native level.

What else can I provide that might help you?

5

Re: Latest version (from nuget) does not always work on Linux/dotnet

Any update on this?
Anything else I can do?

6

Re: Latest version (from nuget) does not always work on Linux/dotnet

I did some debugging on Linux and traced the native error back to the following call:

NativeMethodsLibudev.Instance.udev_device_unref(device)

on line 92 of LinuxHidDevice.cs

It seems this method triggers the native error for some but not all devices.

7

Re: Latest version (from nuget) does not always work on Linux/dotnet

It turns out the use of multiple threads is what's causing the issue.
Modifying the existing GetDevices() method to match the following single-threaded code avoids it:

public IEnumerable<Device> GetDevices()
{
    Device[] deviceList;
    
    lock (_getDevicesLock)
    {
        TypedKey[] devices = GetAllDeviceKeys();
        TypedKey[] additions = devices.Except(_deviceList.Keys).ToArray();
        TypedKey[] removals = _deviceList.Keys.Except(devices).ToArray();

        if (additions.Length > 0)
        {
            foreach (TypedKey addition in additions)
            {
                var typedKey = (TypedKey)addition;

                Device device = null; bool created;

                switch (typedKey.Type)
                {
                    case KeyType.Hid:
                        created = TryCreateHidDevice(typedKey.Key, out device);
                        break;

                    case KeyType.Serial:
                        created = TryCreateSerialDevice(typedKey.Key, out device);
                        break;

                    default:
                        created = false; Debug.Assert(false);
                        break;
                }

                if (created)
                {                            
                    // By not adding on failure, we'll end up retrying every time.
                    lock (_deviceList)
                    {
                        _deviceList.Add(typedKey, device);
                        //           Debug.Print("** HIDSharp detected a new device: {0}", typedKey.Key);
                    }
                }              
            }
        }

        foreach (TypedKey removal in removals)
        {
            _deviceList.Remove(removal);
            Debug.Print("** HIDSharp detected a device removal: {0}", removal.Key);
        }
        deviceList = _deviceList.Values.ToArray();
    }

    return deviceList;
}

While using a single thread the operation succeeds 100% of the time and there does not seem to be a noticeable impact on the performance.
The resulting code could be cleaned up further after this modification, but I did not find the official git repository for this project.
If there was one I'd be more than happy to help out.

8 (edited by drigolin 2019-03-08 04:33:08)

Re: Latest version (from nuget) does not always work on Linux/dotnet

I have the same issue on Linux, but I'm using a simple CLI test program that only enumerates the HID devices (the local.GetHidDevices() call crashes), with no multithreading.
It works only sometimes (about 10% of the time).

It seems related to specific USB devices. In particular, it is very unstable when an HID device with no name is connected, the third one in the lsusb list (fffe:0091).

Bus 001 Device 004: ID 04f3:2234 Elan Microelectronics Corp.
Bus 001 Device 003: ID 0cf3:e300 Atheros Communications, Inc.
Bus 001 Device 013: ID fffe:0091   
Bus 001 Device 008: ID 05ac:0267 Apple, Inc.
Bus 001 Device 007: ID 2109:2813 VIA Labs, Inc.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

If I connect this device, the GetDevices() method fails most of the time. But sometimes it works if I connect and then disconnect the device.

I'm running Ubuntu 18.04, 64-bit.

I'm doing some debugging using the HidSharp.Test project to try to identify the issue. I discovered that internally the library is using multiple threads to add devices.

9 (edited by drigolin 2019-03-08 05:51:16)

Re: Latest version (from nuget) does not always work on Linux/dotnet

After debugging the "double free" error in GetDevices() on Linux, I found that it is raised by this line in LinuxHidDevice.cs:

NativeMethodsLibudev.Instance.udev_device_unref(device);

I commented out that line and now GetDevices() works every time:

finally
                        {
                          //  NativeMethodsLibudev.Instance.udev_device_unref(device);
                        }
                    }
                }
                finally
                {
                    NativeMethodsLibudev.Instance.udev_unref(udev);
                }
            }

Looking around on the web, I found this comment in the code of the signal11 hidapi project:

Line 560 of github.com/signal11/hidapi/blob/master/linux/hid.c

/* hid_dev, usb_dev and intf_dev don't need to be (and can't be)
           unref()d.  It will cause a double-free() error.  I'm not
           sure why.  */

It seems that the HID device doesn't need to be unref'd on Linux.

10

Re: Latest version (from nuget) does not always work on Linux/dotnet

That call to free the device ref is correct, though. The documentation suggests refcount will be 1.
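
To make the refcount reasoning concrete, here is a minimal toy sketch of a reference-counted handle. It is not libudev and not HidSharp code; the `RefCountedHandle` class and its members are hypothetical names used only for illustration. It shows that an object starting at refcount 1 is freed by exactly one unref, and that any further unref is a double free:

```csharp
using System;

// Toy model of a reference-counted native handle (an assumption for
// illustration; this is NOT how libudev is implemented internally).
sealed class RefCountedHandle
{
    // A freshly created object starts with one reference.
    private int _refCount = 1;

    public bool Freed { get; private set; }

    public void Ref()
    {
        if (Freed) throw new InvalidOperationException("use after free");
        _refCount++;
    }

    public void Unref()
    {
        // In native code this would corrupt the heap instead of throwing.
        if (Freed) throw new InvalidOperationException("double free");
        if (--_refCount == 0) Freed = true;
    }
}
```

If the count really is 1 when the unref call runs, a single unref is balanced, so a crash at that point suggests that something else has already released or corrupted the memory.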

By any chance, do you have two copies of libudev on your system?

If so, could you try changing, in NativeMethodsLibudev.cs,
foreach (var instance in new NativeMethodsLibudev[] { new NativeMethodsLibudev0(), new NativeMethodsLibudev1() })
to
foreach (var instance in new NativeMethodsLibudev[] { new NativeMethodsLibudev1(), new NativeMethodsLibudev0() })

Right now, it tries the old library first, so if there are bug fixes in the new one we wouldn't see them.
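
As a rough sketch of why the order matters: a probe loop that takes the first working candidate will never reach a later one if an earlier one also loads. The `BackendProbe` class, the `PickFirstWorking` method, and the probe delegates below are hypothetical and stand in for the real binding classes:

```csharp
using System;
using System.Collections.Generic;

static class BackendProbe
{
    // Returns the name of the first candidate whose probe succeeds,
    // or null if none does. Candidates are tried strictly in order.
    public static string PickFirstWorking(
        IEnumerable<(string Name, Func<bool> Probe)> candidates)
    {
        foreach (var (name, probe) in candidates)
        {
            try
            {
                if (probe()) return name;
            }
            catch (DllNotFoundException)
            {
                // Library not present on this system; try the next one.
            }
        }
        return null;
    }
}
```

With both libraries installed, listing the old one first means the new one is never used, which is why swapping the two entries is worth testing.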

11

Re: Latest version (from nuget) does not always work on Linux/dotnet

Hi, I have only one libudev (udev1). An exception is raised while looking for udev0, but after that a correct instance for udev1 is created. I also changed this code to avoid the exception, but the "double free" error is still there.
Looking around, it seems that HID USB devices on Linux suffer from this double-free issue. I looked for some way in the udev library to prevent the unref call, but found nothing. At the moment I'm using your library with this line disabled and everything works fine. On my PC I have a wired Apple keyboard connected, and on Linux it creates 4 hidraw devices.

12

Re: Latest version (from nuget) does not always work on Linux/dotnet

Are you using it to talk to the Apple keyboard? If so, is it able to talk to all of the hidraw devices?
I wonder if there is something special about the udev information for this device.

13

Re: Latest version (from nuget) does not always work on Linux/dotnet

I need to talk to an RFID reader: I have to write commands and read responses. Everything works fine after commenting out the dispose (the unref call). I still have to try on Windows and macOS to see whether everything works on those OSs.

My guess is that devices which create multiple /dev/hidraw devices may suffer from this double-free issue. The Apple keyboard and this RFID reader both create multiple devices.

The GetDevices() method crashes very frequently. If I remove my RFID reader it happens less frequently, but it still happens.
Looking at the node-hid module implementation, I also saw mentions of a "double free" issue on Linux due to the unref methods.

At the moment, skipping the free does not seem to have side effects.

14

Re: Latest version (from nuget) does not always work on Linux/dotnet

> I have same issue on Linux. But I'm using a simple CLI test program doing only the enumeration of HidDevices (local.GetHidDevices() method call crash.) no multithreading.

That method in the library tries to multi-thread, and that is why it is failing.
The solution to this problem is simply to modify the code so that the Get*Devices methods don't multi-thread.
I have been using the modified version in production for a while now without any issues.

15

Re: Latest version (from nuget) does not always work on Linux/dotnet

This may actually be related to Utf8Marshaler (see https://bugzilla.xamarin.com/show_bug.cgi?id=4722).

Would you mind trying this version?
https://www.zer7.com/files/oss/hidsharp … -04-26.zip

Thanks!

16

Re: Latest version (from nuget) does not always work on Linux/dotnet

I am using a Pro Micro Arduino with the RAWHID example as an HID device on a Banana Pi M1. It sends data every 333 ms.

Starting the console program generates this error:

root@bananapi:/home/dare# mono ConsoleApplication1.exe
priprema
priprema2
.*1**11**111**** Error in `mono': double free or corruption (out): 0xb452f8b8 ***

=================================================================
        Native Crash Reporting
=================================================================
Got a SIGABRT while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================
/proc/self/maps:
004cb000-007a3000 r-xp 00000000 08:01 1580626    /usr/bin/mono-sgen
007b3000-007b7000 r-xp 002d8000 08:01 1580626    /usr/bin/mono-sgen
007b7000-007bb000 rwxp 002dc000 08:01 1580626    /usr/bin/mono-sgen
007bb000-007c7000 rwxp 00000000 00:00 0
01bec000-01d78000 rwxp 00000000 00:00 0          [heap]
b3ffc000-b3ffd000 ---p 00000000 00:00 0
b3ffd000-b40fd000 rwxp 00000000 00:00 0
b40fd000-b40fe000 ---p 00000000 00:00 0
b40fe000-b41fe000 rwxp 00000000 00:00 0
b41fe000-b41ff000 ---p 00000000 00:00 0
b41ff000-b42ff000 rwxp 00000000 00:00 0
b42ff000-b4300000 ---p 00000000 00:00 0
b4300000-b4400000 rwxp 00000000 00:00 0
b4400000-b4436000 rwxp 00000000 00:00 0
b4436000-b4500000 ---p 00000000 00:00 0
b4500000-b454b000 rwxp 00000000 00:00 0
b454b000-b4600000 ---p 00000000 00:00 0
b4600000-b4601000 ---p 00000000 00:00 0
b4601000-b4701000 rwxp 00000000 00:00 0
b4701000-b4806000 r-xp 00000000 08:01 2100606    /usr/lib/mono/gac/System.Core/4.0.0.0__b77a5c561934e089/System.Core.dll
b4806000-b4f90000 r-xp 00000000 08:01 2101564    /usr/lib/mono/aot-cache/arm/mscorlib.dll.so
b4f90000-b4f9f000 ---p 0078a000 08:01 2101564    /usr/lib/mono/aot-cache/arm/mscorlib.dll.so
b4f9f000-b4fa0000 r-xp 00789000 08:01 2101564    /usr/lib/mono/aot-cache/arm/mscorlib.dll.so
b4fa0000-b4fa1000 rwxp 0078a000 08:01 2101564    /usr/lib/mono/aot-cache/arm/mscorlib.dll.so
b4fa1000-b4fb5000 rwxp 00000000 00:00 0

=================================================================
        Basic Fault Adddress Reporting
=================================================================
Memory around native instruction pointer (0xb6dec6f6):0xb6dec6e6  0d 00 42 39 0d 00 b4 00 00 00 80 b5 67 46 00 df  ..B9........gF..
0xb6dec6f6  80 bd 03 4b 1d ee 70 0f 7b 44 1b 68 18 44 70 47  ...K..p.{D.h.DpG
0xb6dec706  00 bf b6 39 0d 00 08 b5 00 23 08 f0 c8 ff 08 f0  ...9.....#......
0xb6dec716  64 ba 03 4b 1d ee 70 0f 7b 44 1b 68 18 44 70 47  d..K..p.{D.h.DpG

=================================================================
        Native stacktrace:
=================================================================
         (No frames)


=================================================================
        Telemetry Dumper:
=================================================================
Pkilling 0xb6fef000 from 0xb41fd450
Pkilling 0xb6b63450 from 0xb41fd450
Pkilling 0xb43ff450 from 0xb41fd450
Pkilling 0xb42fe450 from 0xb41fd450
Pkilling 0xb4700450 from 0xb41fd450
Pkilling 0xb40fc450 from 0xb41fd450
Entering thread summarizer pause from 0xb41fd450
Finished thread summarizer pause from 0xb41fd450.

Waiting for dumping threads to resume

Debug info from gdb:


=================================================================
        External Debugger Dump:
=================================================================
mono_gdb_render_native_backtraces not supported on this platform, unable to find gdb or lldb

=================================================================
        Managed Stacktrace:
=================================================================
          at <unknown> <0xffffffff>
          at HidSharp.Platform.Linux.NativeMethodsLibudev1:native_udev_device_unref <0x00037>
          at HidSharp.Platform.Linux.NativeMethodsLibudev1:udev_device_unref <0x00017>
          at HidSharp.Platform.Linux.LinuxHidDevice:TryCreate <0x004c3>
          at HidSharp.Platform.Linux.LinuxHidManager:TryCreateHidDevice <0x0003b>
          at <>c__DisplayClass5:<GetDevices>b__3 <0x000cf>
          at System.Threading.QueueUserWorkItemCallback:WaitCallback_Context <0x0006b>
          at System.Threading.ExecutionContext:RunInternal <0x0021f>
          at System.Threading.ExecutionContext:Run <0x0002b>
          at System.Threading.QueueUserWorkItemCallback:System.Threading.IThreadPoolWorkItem.ExecuteWorkItem <0x00093>
          at System.Threading.ThreadPoolWorkQueue:Dispatch <0x0025f>
          at System.Threading._ThreadPoolWaitCallback:PerformWaitCallback <0x0000b>
          at <Module>:runtime_invoke_bool <0x0006f>
=================================================================
Aborted


After downloading HIDSharp_test_2019-04-26.zip and testing it, it is now working OK.

17

Re: Latest version (from nuget) does not always work on Linux/dotnet

Hi,
So I updated to your latest version and this specific problem seems to be fixed.
However, I still believe the multi-threaded code is of no use here and it actually has a side effect:

With the multi-threaded version, devices are listed out of order, so you may have to sort them.
The single-threaded version returns them sorted.
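
For anyone who needs a stable order regardless of how the list was built, a minimal sketch of sorting by device path follows. `FakeDevice` and `DeviceOrdering` are hypothetical stand-ins so the snippet is self-contained; with HidSharp, the same `OrderBy` would presumably be applied to the real device objects using their path property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a device object; only the path matters here.
sealed class FakeDevice
{
    public FakeDevice(string devicePath) { DevicePath = devicePath; }
    public string DevicePath { get; }
}

static class DeviceOrdering
{
    // Returns a new list sorted by path using ordinal string comparison.
    // Note that ordinal order is lexicographic, so "/dev/hidraw10"
    // sorts before "/dev/hidraw2".
    public static List<FakeDevice> SortByPath(IEnumerable<FakeDevice> devices) =>
        devices.OrderBy(d => d.DevicePath, StringComparer.Ordinal).ToList();
}
```

Sorting on the caller's side keeps the result deterministic even if enumeration order varies between runs.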