This repository has been archived by the owner on Aug 23, 2024. It is now read-only.

Problem occurs when using libmpk with multiple threads #1

Open
guozhenwuai opened this issue Sep 8, 2020 · 1 comment

@guozhenwuai

I wrote the program below and found that some threads exit abnormally while executing mpt_begin / mpt_end.

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <mpt/mpt.h>
#include <climits>
#include <fcntl.h>
#include <sys/mman.h>
#include <pthread.h>

#define DOM_NUM 1024
#define MAX_THREAD 12

char *iso_data[DOM_NUM];
void (*funcs[DOM_NUM])(int);
int vpkeys[DOM_NUM];

void chain_func(int i) {
    mpt_begin(vpkeys[i], PROT_READ | PROT_WRITE);
    mpt_end(vpkeys[i]);
    funcs[i+1](i+1);
}

void end_func(int i) {
    mpt_begin(vpkeys[i], PROT_READ | PROT_WRITE);
    mpt_end(vpkeys[i]);
}

void init_domain(int num) {
    mpt_init(-1);
    for (int i = 0; i < num; i++) {
        vpkeys[i] = mpt_mmap((void **)&iso_data[i], 0x1000, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE);
        funcs[i] = chain_func;
    }
    funcs[num - 1] = end_func;
}

typedef struct thread_info {
    pthread_t pid;
    unsigned long long res;
} info_t;

info_t threads_info[MAX_THREAD];

void *worker_func(void *arg) {

    int count = 10000;

    unsigned long long start = rdtsc();
    for (int i = 0; i < count; i++) {
        funcs[0](0);
    }
    unsigned long long end = rdtsc();

    ((info_t *)arg)->res = (end - start) / count;
    printf("thread with res %llu\n", (end - start) / count);

    return NULL;
}

int main(int argc, char *argv[]) {
    int num = 30;
    int threads = 2;

    init_domain(num);

    for (int i = 0; i < threads; i++) {
        int ret = pthread_create(&threads_info[i].pid, NULL, &worker_func, &threads_info[i]);
        /* pthread_create returns a positive error number on failure, never a negative value */
        if (ret != 0) {
            printf("pthread_create failed\n");
            return -1;
        }
    }

    unsigned long long total = 0;
    for (int i = 0; i < threads; i++) {
        int ret = pthread_join(threads_info[i].pid, NULL);
        /* pthread_join also reports failure as a positive error number */
        if (ret != 0) {
            printf("pthread_join failed\n");
            return -1;
        }
        total += threads_info[i].res;
        printf("thread %d time %llu\n", i, threads_info[i].res);
    }

    unsigned long long avg = total / threads;

    printf("%d,%llu,\n", num, avg);

    return 0;
}

The program chains num functions, and each function switches domain with mpt_begin / mpt_end before calling the next one. With a single thread (threads = 1 in main()), the program works well. With threads = 2, however, some threads appear to crash and exit before reaching return NULL in worker_func: their res field is still 0 after the join.
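To check whether the threading harness itself is at fault, the same chain can be run with the mpt_begin / mpt_end calls removed. The following is only a minimal sketch under that assumption: the names `run_chain_without_mpk`, `chain_step`, `chain_last` and `chain_worker` are made up for illustration and are not part of libmpk. It counts completed iterations per thread instead of timing them, and it treats any nonzero return from pthread_create / pthread_join as an error.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define CHAIN_MAX 1024
#define THREAD_MAX 12
#define ITERATIONS 10000UL

/* Hypothetical stand-in for funcs[] from the program above. */
static void (*chain[CHAIN_MAX])(int);

/* Same call structure as chain_func / end_func, without any libmpk calls. */
static void chain_step(int i) { chain[i + 1](i + 1); }
static void chain_last(int i) { (void)i; }

static void *chain_worker(void *arg) {
    unsigned long *done = (unsigned long *)arg;
    for (unsigned long i = 0; i < ITERATIONS; i++) {
        chain[0](0);
        (*done)++;          /* each thread writes only its own counter */
    }
    return NULL;
}

/* Returns the number of threads that completed every iteration,
 * or -1 on invalid arguments or a pthread error. */
int run_chain_without_mpk(int num, int threads) {
    pthread_t tids[THREAD_MAX];
    unsigned long done[THREAD_MAX] = {0};
    int started = 0, failed = 0, finished = 0;

    if (num < 1 || num > CHAIN_MAX || threads < 1 || threads > THREAD_MAX)
        return -1;

    for (int i = 0; i < num; i++)
        chain[i] = chain_step;
    chain[num - 1] = chain_last;

    for (; started < threads; started++) {
        int ret = pthread_create(&tids[started], NULL, chain_worker, &done[started]);
        if (ret != 0) {     /* positive error number on failure */
            fprintf(stderr, "pthread_create: %s\n", strerror(ret));
            failed = 1;
            break;
        }
    }

    /* Join only the threads that were actually started. */
    for (int i = 0; i < started; i++) {
        int ret = pthread_join(tids[i], NULL);
        if (ret != 0) {
            fprintf(stderr, "pthread_join: %s\n", strerror(ret));
            failed = 1;
        } else if (done[i] == ITERATIONS) {
            finished++;
        }
    }
    return failed ? -1 : finished;
}
```

If `run_chain_without_mpk(30, 2)` returns 2, the function-pointer chain and the thread setup are probably fine, which would point at the mpt_begin / mpt_end path rather than at the test program.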

I also found the following error messages in the kernel log:

[77496.266736] BUG: unable to handle kernel NULL pointer dereference at 0000000000000014
[77496.275514] IP: SyS_pkey_mprotect_evict+0x80/0xb0                                 
[77496.281130] PGD 0 P4D 0                                                      
[77496.284558] Oops: 0002 [#48] SMP                                             
[77496.288651] Modules linked in: xt_addrtype iptable_filter br_netfilter bridge stp llc overlay intel_rapl binfmt_misc x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mgag200 ttm pcbc drm_kms_helper aesni_intel drm ipmi_ssif aes_x86_64 crypto_simd glue_helper cryptd fb_sys_fops syscopyarea sysfillrect ipmi_si sysimgblt dcdbas intel_cstate mei_me ipmi_devintf ipmi_msghandler intel_rapl_perf nfit mac_hid acpi_power_meter mei acpi_pad lpc_ich sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 igb megaraid_sas i2c_algo_bit ahci dca libahci
[77496.348245] CPU: 34 PID: 76459 Comm: main Tainted: G      D    OE   4.14.2-mpk #1                 
[77496.356684] Hardware name: Dell Inc. PowerEdge R640/008R9M, BIOS 1.2.71 12/04/2017
[77496.365194] task: ffff9244e83b8000 task.stack: ffffb6cd49b0c000              
[77496.372043] RIP: 0010:SyS_pkey_mprotect_evict+0x80/0xb0                      
[77496.378177] RSP: 0018:ffffb6cd49b0ff40 EFLAGS: 00010246
[77496.384293] RAX: 00000000ffffffff RBX: 00000000ffffffff RCX: 0000000000000000
[77496.392299] RDX: 0000000000000000 RSI: 0000000000001000 RDI: 00007fdd3b5e7000
[77496.400297] RBP: ffffb6cd49b0ff48 R08: 00000000ffffffff R09: 0000000000000000
[77496.408287] R10: ffff9244e0b01000 R11: ffff9244e83b8000 R12: 0000000000000005
[77496.416272] R13: 0000000000000003 R14: 0000000000000000 R15: 00007ffc663f3ac0
[77496.424263] FS:  00007fdd39b6d700(0000) GS:ffff9244fc040000(0000) knlGS:0000000000000000
[77496.433208] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[77496.439823] CR2: 0000000000000014 CR3: 0000001fe1609005 CR4: 00000000004606e0
[77496.447829] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[77496.455835] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[77496.463823] PKRU: 55555550
[77496.467388] Call Trace:
[77496.470688]  entry_SYSCALL_64_fastpath+0x1e/0xa9
[77496.476142] RIP: 0033:0x7fdd3adc9839
[77496.480556] RSP: 002b:00007fdd39b6cd98 EFLAGS: 00000246 ORIG_RAX: 0000000000000151
[77496.488976] RAX: ffffffffffffffda RBX: 00007fdd2c000000 RCX: 00007fdd3adc9839
[77496.496964] RDX: 0000000000000000 RSI: 0000000000001000 RDI: 00007fdd3b5e7000
[77496.504953] RBP: 0000000000021000 R08: 00000000ffffffff R09: 0000000000000000
[77496.512940] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fdd30000000
[77496.520925] R13: 0000000000000000 R14: 00005596e56f2050 R15: 00007ffc663f3ac0
[77496.528907] Code: f8 1f 41 c1 e8 12 44 01 c0 25 ff 3f 00 00 44 29 c0 4c 63 c0 4f 8d 0c 40 49 c1 e1 04 4d 01 d1 45 8b 01 41 83 f8 ff 75 cb 45 31 c9 <41> c7 41 14 ff ff ff ff e8 b3 97 da ff 5b 5d c3 49 83 c1 08 41
[77496.549502] RIP: SyS_pkey_mprotect_evict+0x80/0xb0 RSP: ffffb6cd49b0ff40
[77496.557086] CR2: 0000000000000014
[77496.561312] ---[ end trace 65f6d43ddbec5cb1 ]---
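A few fields of this trace can be decoded by hand. The marked bytes in the Code: line, `<41> c7 41 14 ff ff ff ff`, encode a 32-bit store of 0xffffffff to offset 0x14 from R9, and the register dump shows R09 = 0, which matches the faulting address CR2 = 0x14. That suggests, although it does not prove, that SyS_pkey_mprotect_evict ended up writing a field through a NULL pointer. The sketch below decodes the other two fields using the hardware-defined bit layouts of the x86 page-fault error code and of PKRU; the function names are made up for illustration and belong to neither the kernel nor libmpk.

```c
#include <stdint.h>
#include <stdio.h>

/* x86 page-fault error code bits, as printed after "Oops:". */
#define PF_PROT  (1u << 0)  /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1)  /* 0: read access,      1: write access */
#define PF_USER  (1u << 2)  /* 0: kernel mode,      1: user mode */

/* Writes a one-line description of the low three error-code bits to out. */
void describe_pf_error(unsigned code, char *out, size_t len) {
    snprintf(out, len, "%s %s in %s mode",
             (code & PF_PROT)  ? "protection-violation" : "not-present-page",
             (code & PF_WRITE) ? "write" : "read",
             (code & PF_USER)  ? "user" : "kernel");
}

/* PKRU holds two bits per protection key k (0..15):
 * bit 2k is AD (access disable), bit 2k+1 is WD (write disable). */
int pkru_access_disabled(uint32_t pkru, int key) {
    return (int)((pkru >> (2 * key)) & 1u);
}

int pkru_write_disabled(uint32_t pkru, int key) {
    return (int)((pkru >> (2 * key + 1)) & 1u);
}
```

For the values in this trace, `describe_pf_error(0x0002, ...)` yields "not-present-page write in kernel mode", and PKRU 55555550 has the access-disable bit clear for keys 0 and 1 and set for keys 2 through 15.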

I have the following questions:

  • Am I using the interfaces provided by libmpk correctly? I am worried that the problem is caused by a mistake in my code.
  • Does libmpk support multi-threaded programs? If so, could you provide a runnable multi-threaded benchmark? If there are bugs, could you fix them?
@andsanmar
andsanmar commented Sep 7, 2021

Hi @guozhenwuai, while digging through the codebase and comparing the bundled kernel with the original Linux one, I found a change in mm/mprotect.c that may be of interest to you: https://github.com/sslab-gatech/libmpk/blob/master/kernel/mm/mprotect.c#L747

I suspect that is the issue you are hitting here. The codebase has not been updated in more than 2 years, so I suppose the authors won't address it.
