Skip to content

High CPU usage - futex_wait, gettimeofday syscalls #8735

@megastef

Description

@megastef

Got situation with very high CPU usage on node 4.5 and 6.x.
The actual node.js process get stuck (no logs etc ...).
running ctrace prints this:

sudo ctrace -p 15084 -v
[15217] Trace on:  attach to process 15084
[pid:15094] [08:52:00.317489] futex, 202 -- fast user-space locking linux (0x1df27c4, FUTEX_WAIT_PRIVATE, 660, NULL <unfinished ...>
[pid:15090] [08:52:00.317855] futex, 202 -- fast user-space locking linux (0x2ab7f68, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid:15095] [08:52:00.328036] futex, 202 -- fast user-space locking linux (0x1df27c4, FUTEX_WAIT_PRIVATE, 660, NULL <unfinished ...>
[pid:15093] [08:52:00.328051] futex, 202 -- fast user-space locking linux (0x1df27c4, FUTEX_WAIT_PRIVATE, 660, NULL <unfinished ...>
[pid:15092] [08:52:00.328058] futex, 202 -- fast user-space locking linux (0x1df27c4, FUTEX_WAIT_PRIVATE, 660, NULL <unfinished ...>
[pid:15089] [08:52:00.328065] futex, 202 -- fast user-space locking linux (0x2ab7f68, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid:15088] [08:52:00.328072] futex, 202 -- fast user-space locking linux (0x2ab7f68, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid:15087] [08:52:00.328078] futex, 202 -- fast user-space locking linux (0x2ab7f68, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>````

Executing interrupted by signal ...
-----------------------------------------------------------------------------------------------------------------------------------------------------
SYSCALL         TIME %     SECOND     CALLS      DESCRIPTION             ERRORNO (COUNT)
*               100        0.000000   8          -                       (0)            
futex           NaN        0.000000   8          fast user-space locking -              
-----------------------------------------------------------------------------------------------------------------------------------------------------

Also seen often before:

futex           0.0        0.000438   827        fast user-space locking                          EAGAIN (Resource temporarily unavailable) (3)

After a few minutes the process contiunes to run. Then I get this in ctrace:

Executing interrupted by signal ...
-----------------------------------------------------------------------------------------------------------------------------------------------------
SYSCALL         TIME %     SECOND     CALLS      DESCRIPTION                                    ERRORNO (COUNT)                              
*               100        0.688409   15652      -                                              (6)                                          
gettimeofday    95.4       0.656960   14499      get / set time                                 -                                            
clock_gettime   3.9        0.026934   598        clock and time functions                       -                                            
mmap            0.2        0.001334   47         map or unmap files or devices into memory      -                                            
write           0.1        0.000379   9          write to a file descriptor                     -                                            
socket          0.1        0.000574   8          create an endpoint for communication           -                                            
pread           0.1        0.000500   1          read or write data into multiple               -                                            
getsockname     0.0        0.000028   6          get socket name                                -                                            
getdents        0.0        0.000094   2          get directory entries                          -                                            
close           0.0        0.000147   12         close a file descriptor                        -                                            
open            0.0        0.000194   5          open and possibly create a file                -                                            
read            0.0        0.000083   10         read from a file descriptor                    -                                            
epoll_wait      0.0        0.000033   3          wait for an I/O event on an epoll file         -                                            
stat            0.0        0.000040   5          get file status                                -                                            
bind            0.0        0.000013   2          bind a name to a socket                        -                                            
futex           0.0        0.000252   361        fast user-space locking                        EAGAIN (Resource temporarily unavailable) (4)
sendto          0.0        0.000063   4          send a message on a socket                     -                                            
recvmsg         0.0        0.000028   4          receive a message from a socket                -                                            
fstat           0.0        0.000015   3          get file status                                -                                            
getcwd          0.0        0.000080   1          get current working directory                  -                                            
munmap          0.0        0.000076   8          map or unmap files or devices into memory      -                                            
connect         0.0        0.000099   10         initiate a connection on a socket              EINPROGRESS (Operation now in progress) (2)  
poll            0.0        0.000012   6          wait for some event on a file descriptor       -                                            
ioctl           0.0        0.000034   6          control device                                 -                                            
recvfrom        0.0        0.000012   2          receive a message from a socket                -                                            
openat          0.0        0.000012   1          open and possibly create a file                -                                            
epoll_ctl       0.0        0.000047   2          control interface for an epoll file descriptor -                                            
madvise         0.0        0.000303   31         give advice about use of memory                -                                            
mprotect        0.0        0.000018   4          set protection on a region of memory           -                                            
getsockopt      0.0        0.000045   2          get and set options on sockets                 -                                            
-----------------------------------------------------------------------------------------------------------------------------------------------------

Any ideas to how I could find the "gettime" calls, or debug this further?

Metadata

Metadata

Assignees

No one assigned

    Labels

    questionIssues that look for answers.

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions