[{"id":"36763646704","type":"IssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":28554304,"name":"ebiggers/libdeflate","url":"https://api.github.com/repos/ebiggers/libdeflate"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/ebiggers/libdeflate/issues/335","repository_url":"https://api.github.com/repos/ebiggers/libdeflate","labels_url":"https://api.github.com/repos/ebiggers/libdeflate/issues/335/labels{/name}","comments_url":"https://api.github.com/repos/ebiggers/libdeflate/issues/335/comments","events_url":"https://api.github.com/repos/ebiggers/libdeflate/issues/335/events","html_url":"https://github.com/ebiggers/libdeflate/issues/335","id":2063286720,"node_id":"I_kwDOAbO0QM56-0HA","number":335,"title":"I added stream & multi-thread support for libdeflate, need help!","user":{"login":"sisong","id":2347214,"node_id":"MDQ6VXNlcjIzNDcyMTQ=","avatar_url":"https://avatars.githubusercontent.com/u/2347214?v=4","gravatar_id":"","url":"https://api.github.com/users/sisong","html_url":"https://github.com/sisong","followers_url":"https://api.github.com/users/sisong/followers","following_url":"https://api.github.com/users/sisong/following{/other_user}","gists_url":"https://api.github.com/users/sisong/gists{/gist_id}","starred_url":"https://api.github.com/users/sisong/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/sisong/subscriptions","organizations_url":"https://api.github.com/users/sisong/orgs","repos_url":"https://api.github.com/users/sisong/repos","events_url":"https://api.github.com/users/sisong/events{/privacy}","received_events_url":"https://api.github.com/users/sisong/received_events","type":"User","site_admin":false},"labels":[],"state":"open","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":3,"created_at":"2024-01-03T05:56:31Z
","updated_at":"2024-03-21T14:18:35Z","closed_at":null,"author_association":"NONE","active_lock_reason":null,"body":"Thank you for sharing the libdeflate, it's great! \nMy project want run on phone, so I add some API to libdeflate for support compress&decompress by stream (ref #19), & support compress by multi-thread (ref #40); \nAnd at the same time, try to keep it simple and fast. \nWith these modifications at [stream_mt](https://github.com/sisong/libdeflate/tree/stream-mt), I rewrote gzip.c to pgzip.c for testing stream and multi-thread parallel. \n \nit's can run ok when compression_level<=9, but got a bad compress ratio when compression_level>=10; Because I don't know how to rebuild the hash dictionary for bt_matchfinder. \nI added func bt_matchfinder_skip_bytes(), it only simple loop call bt_matchfinder_skip_byte(), so it's fail. \nI need some help, How to implement bt_matchfinder_skip_bytes()? it's similar ht_matchfinder_skip_bytes() or hc_matchfinder_skip_bytes(). \n \n\ncurrent work progress, some files for compression testing: \n| |file name|file original size|\n|----:|:----|----:|\n|1|Chrome_107.0.5304.122-x64-Stable.win.tar|278658560|\n|2|Emacs_28.2-universal.mac.tar|196380160|\n|3|gcc_12.2.0.src.tar|865884160|\n|4|jdk_x64_mac_openj9_16.0.1_9_openj9-0.26.0.tar|363765760|\n|5|linux_5.19.9.src.tar|1269637120|\n \n**test PC**: Windows11, CPU R9-7945HX, SSD PCIe4.0x4 4T, DDR5 5200MHz 32Gx2 \n**Program version**: zlib v1.2.13, gzip in libdeflate v1.19, pgzip in [stream_mt](https://github.com/sisong/libdeflate/tree/stream-mt) based on libdeflate v1.19 \nOnly test deflate compress & decompress, no crc; build by vc2022; The time counted includes the time of read & write file data; `-p-16` means compressor run with 16 threads. \n|Program|C ratio|C ave. mem|C ave. speed|D ave. mem|D max mem|D ave. 
speed|\n|:----|----:|----:|----:|----:|----:|----:|\n|zlib-1|29.981%|2M|163MB/s|1M|1M|531MB/s|\n|zlib-2|29.077%|2M|151MB/s|1M|1M|548MB/s|\n|zlib-3|28.402%|2M|124MB/s|1M|1M|558MB/s|\n|zlib-4|27.147%|2M|108MB/s|1M|1M|556MB/s|\n|zlib-5|26.442%|2M|84MB/s|1M|1M|570MB/s|\n|zlib-6|26.077%|2M|58MB/s|1M|1M|574MB/s|\n|zlib-7|25.972%|2M|48MB/s|1M|1M|576MB/s|\n|zlib-8|25.879%|2M|32MB/s|1M|1M|589MB/s|\n|zlib-9|25.852%|2M|27MB/s|1M|1M|589MB/s|\n||\n|gzip -1|28.325%|571M|342MB/s|569M|1214M|692MB/s|\n|gzip -2|27.465%|571M|254MB/s|569M|1214M|703MB/s|\n|gzip -3|27.030%|571M|234MB/s|569M|1214M|714MB/s|\n|gzip -4|26.740%|571M|217MB/s|569M|1214M|708MB/s|\n|gzip -5|26.390%|571M|193MB/s|569M|1214M|719MB/s|\n|gzip -6|26.096%|571M|156MB/s|569M|1214M|723MB/s|\n|gzip -7|25.956%|571M|113MB/s|569M|1214M|718MB/s|\n|gzip -8|25.861%|571M|69MB/s|569M|1214M|721MB/s|\n|gzip -9|25.847%|571M|55MB/s|569M|1214M|722MB/s|\n||\n|pgzip -1 -p-1|28.325%|5M|380MB/s|33M|33M|999MB/s|\n|pgzip -2 -p-1|27.466%|5M|274MB/s|33M|33M|1009MB/s|\n|pgzip -3 -p-1|27.030%|5M|252MB/s|33M|33M|1022MB/s|\n|pgzip -4 -p-1|26.740%|5M|233MB/s|33M|33M|1026MB/s|\n|pgzip -5 -p-1|26.390%|5M|201MB/s|33M|33M|1028MB/s|\n|pgzip -6 -p-1|26.096%|5M|161MB/s|33M|33M|1032MB/s|\n|pgzip -7 -p-1|25.956%|5M|115MB/s|33M|33M|1040MB/s|\n|pgzip -8 -p-1|25.861%|5M|71MB/s|33M|33M|1024MB/s|\n|pgzip -9 -p-1|25.846%|5M|56MB/s|33M|33M|1034MB/s|\n|pgzip -1 -p-4|28.326%|26M|1415MB/s|33M|33M|999MB/s|\n|pgzip -2 -p-4|27.466%|28M|1045MB/s|33M|33M|1011MB/s|\n|pgzip -3 -p-4|27.030%|28M|948MB/s|33M|33M|1011MB/s|\n|pgzip -4 -p-4|26.740%|28M|878MB/s|33M|33M|1023MB/s|\n|pgzip -5 -p-4|26.390%|28M|763MB/s|33M|33M|1033MB/s|\n|pgzip -6 -p-4|26.097%|28M|611MB/s|33M|33M|1034MB/s|\n|pgzip -7 -p-4|25.956%|28M|442MB/s|33M|33M|1041MB/s|\n|pgzip -8 -p-4|25.861%|28M|272MB/s|33M|33M|1012MB/s|\n|pgzip -9 -p-4|25.847%|28M|216MB/s|33M|33M|1010MB/s|\n|pgzip -1 -p-16|28.326%|101M|3833MB/s|33M|33M|968MB/s|\n|pgzip -2 -p-16|27.466%|108M|2995MB/s|33M|33M|977MB/s|\n|pgzip -3 
-p-16|27.030%|108M|2859MB/s|33M|33M|984MB/s|\n|pgzip -4 -p-16|26.740%|108M|2646MB/s|33M|33M|978MB/s|\n|pgzip -5 -p-16|26.390%|108M|2344MB/s|33M|33M|999MB/s|\n|pgzip -6 -p-16|26.097%|108M|2005MB/s|33M|33M|1006MB/s|\n|pgzip -7 -p-16|25.956%|108M|1480MB/s|33M|33M|1006MB/s|\n|pgzip -8 -p-16|25.861%|108M|899MB/s|33M|33M|992MB/s|\n|pgzip -9 -p-16|25.847%|108M|696MB/s|33M|33M|993MB/s|","reactions":{"url":"https://api.github.com/repos/ebiggers/libdeflate/issues/335/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/ebiggers/libdeflate/issues/335/timeline","performed_via_github_app":null,"state_reason":null},"comment":{"url":"https://api.github.com/repos/ebiggers/libdeflate/issues/comments/2012420667","html_url":"https://github.com/ebiggers/libdeflate/issues/335#issuecomment-2012420667","issue_url":"https://api.github.com/repos/ebiggers/libdeflate/issues/335","id":2012420667,"node_id":"IC_kwDOAbO0QM538xo7","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_at":"2024-03-21T14:18:35Z","updated_at":"2024-03-21T14:18:35Z","author_association":"NONE","body":"libd
eflate-pgzip streaming decompression, beats igzip (from ISA-L) decompression speed (the latter is/was the fastest gzip streaming decompressor that I know of). Quite impressive!\r\n\r\n```\r\n$ /software/libdeflate_streaming/build/programs/libdeflate-pgzip -d -c fragments.tsv.gz > /dev/null\r\n\r\nTime output:\r\n------------\r\n\r\n * Command: /software/libdeflate_streaming/build/programs/libdeflate-pgzip -d -c fragments.tsv.gz\r\n * Elapsed wall time: 0:05.60 = 5.60 seconds\r\n * Elapsed CPU time:\r\n - User: 5.34\r\n - Sys: 0.24\r\n * CPU usage: 99%\r\n * Context switching:\r\n - Voluntarily (e.g.: waiting for I/O operation): 5\r\n - Involuntarily (time slice expired): 27\r\n * Maximum resident set size (RSS: memory) (kiB): 29284\r\n * Number of times the process was swapped out of main memory: 0\r\n * Filesystem:\r\n - # of inputs: 0\r\n - # of outputs: 0\r\n * Exit status: 0\r\n\r\n\r\n$ timeit /software/libdeflate_streaming/build/programs/libdeflate-gzip -d -c fragments.tsv.gz > /dev/null\r\n\r\nTime output:\r\n------------\r\n\r\n * Command: /software/libdeflate_streaming/build/programs/libdeflate-gzip -d -c fragments.tsv.gz\r\n * Elapsed wall time: 0:12.22 = 12.22 seconds\r\n * Elapsed CPU time:\r\n - User: 10.81\r\n - Sys: 1.38\r\n * CPU usage: 99%\r\n * Context switching:\r\n - Voluntarily (e.g.: waiting for I/O operation): 3\r\n - Involuntarily (time slice expired): 77\r\n * Maximum resident set size (RSS: memory) (kiB): 5744956\r\n * Number of times the process was swapped out of main memory: 0\r\n * Filesystem:\r\n - # of inputs: 0\r\n - # of outputs: 0\r\n * Exit status: 0\r\n\r\n\r\n$ timeit /software/isa-l/programs/igzip -d -c fragments.tsv.gz > /dev/null\r\n\r\nTime output:\r\n------------\r\n\r\n * Command: /software/isa-l/programs/igzip -d -c fragments.tsv.gz\r\n * Elapsed wall time: 0:07.32 = 7.32 seconds\r\n * Elapsed CPU time:\r\n - User: 7.10\r\n - Sys: 0.20\r\n * CPU usage: 99%\r\n * Context switching:\r\n - Voluntarily (e.g.: waiting for I/O 
operation): 78\r\n - Involuntarily (time slice expired): 62\r\n * Maximum resident set size (RSS: memory) (kiB): 4176\r\n * Number of times the process was swapped out of main memory: 0\r\n * Filesystem:\r\n - # of inputs: 888\r\n - # of outputs: 0\r\n * Exit status: 0\r\n```\r\n\r\nOne thing that doesn't work is decompression gzip files with multiple gzip headers, as only the first gzip stream is decompressed. `bgzip` of HTSlib makes block gzipped files like this and is used a lot in bioinformatics to compress files.\r\n\r\n```bash\r\necho \"1\" | gzip > multiple_gzip_header.gz\r\necho \"2\" | gzip >> multiple_gzip_header.gz\r\necho \"3\" | gzip >> multiple_gzip_header.gz\r\n\r\n$ /software/libdeflate_streaming/build/programs/libdeflate-pgzip -cd multiple_gzip_header.gz\r\n1\r\n\r\n# Standard gzip does not have problems with those files.\r\n$ zcat multiple_gzip_header.gz\r\n1\r\n2\r\n3\r\n\r\n\r\n# igzip of ISA-L works too.\r\n$ /software/isa-l/programs/igzip -cd multiple_gzip_header.gz\r\n1\r\n2\r\n3\r\n```\r\n\r\n\r\n","reactions":{"url":"https://api.github.com/repos/ebiggers/libdeflate/issues/comments/2012420667/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-21T14:18:36Z"},{"id":"36756012349","type":"IssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":28554304,"name":"ebiggers/libdeflate","url":"https://api.github.com/repos/ebiggers/libdeflate"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/ebiggers/libdeflate/issues/335","repository_url":"https://api.github.com/repos/ebiggers/libdeflate","labels_url":"https://api.github.com/repos/ebiggers/libdeflate/issues/335/labels{/name}","comments_url":"https://api.github.com/repos/ebiggers/libdeflate/iss
ues/335/comments","events_url":"https://api.github.com/repos/ebiggers/libdeflate/issues/335/events","html_url":"https://github.com/ebiggers/libdeflate/issues/335","id":2063286720,"node_id":"I_kwDOAbO0QM56-0HA","number":335,"title":"I added stream & multi-thread support for libdeflate, need help!","user":{"login":"sisong","id":2347214,"node_id":"MDQ6VXNlcjIzNDcyMTQ=","avatar_url":"https://avatars.githubusercontent.com/u/2347214?v=4","gravatar_id":"","url":"https://api.github.com/users/sisong","html_url":"https://github.com/sisong","followers_url":"https://api.github.com/users/sisong/followers","following_url":"https://api.github.com/users/sisong/following{/other_user}","gists_url":"https://api.github.com/users/sisong/gists{/gist_id}","starred_url":"https://api.github.com/users/sisong/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/sisong/subscriptions","organizations_url":"https://api.github.com/users/sisong/orgs","repos_url":"https://api.github.com/users/sisong/repos","events_url":"https://api.github.com/users/sisong/events{/privacy}","received_events_url":"https://api.github.com/users/sisong/received_events","type":"User","site_admin":false},"labels":[],"state":"open","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":2,"created_at":"2024-01-03T05:56:31Z","updated_at":"2024-03-21T10:46:00Z","closed_at":null,"author_association":"NONE","active_lock_reason":null,"body":"Thank you for sharing the libdeflate, it's great! \nMy project want run on phone, so I add some API to libdeflate for support compress&decompress by stream (ref #19), & support compress by multi-thread (ref #40); \nAnd at the same time, try to keep it simple and fast. \nWith these modifications at [stream_mt](https://github.com/sisong/libdeflate/tree/stream-mt), I rewrote gzip.c to pgzip.c for testing stream and multi-thread parallel. 
\n \nit's can run ok when compression_level<=9, but got a bad compress ratio when compression_level>=10; Because I don't know how to rebuild the hash dictionary for bt_matchfinder. \nI added func bt_matchfinder_skip_bytes(), it only simple loop call bt_matchfinder_skip_byte(), so it's fail. \nI need some help, How to implement bt_matchfinder_skip_bytes()? it's similar ht_matchfinder_skip_bytes() or hc_matchfinder_skip_bytes(). \n \n\ncurrent work progress, some files for compression testing: \n| |file name|file original size|\n|----:|:----|----:|\n|1|Chrome_107.0.5304.122-x64-Stable.win.tar|278658560|\n|2|Emacs_28.2-universal.mac.tar|196380160|\n|3|gcc_12.2.0.src.tar|865884160|\n|4|jdk_x64_mac_openj9_16.0.1_9_openj9-0.26.0.tar|363765760|\n|5|linux_5.19.9.src.tar|1269637120|\n \n**test PC**: Windows11, CPU R9-7945HX, SSD PCIe4.0x4 4T, DDR5 5200MHz 32Gx2 \n**Program version**: zlib v1.2.13, gzip in libdeflate v1.19, pgzip in [stream_mt](https://github.com/sisong/libdeflate/tree/stream-mt) based on libdeflate v1.19 \nOnly test deflate compress & decompress, no crc; build by vc2022; The time counted includes the time of read & write file data; `-p-16` means compressor run with 16 threads. \n|Program|C ratio|C ave. mem|C ave. speed|D ave. mem|D max mem|D ave. 
speed|\n|:----|----:|----:|----:|----:|----:|----:|\n|zlib-1|29.981%|2M|163MB/s|1M|1M|531MB/s|\n|zlib-2|29.077%|2M|151MB/s|1M|1M|548MB/s|\n|zlib-3|28.402%|2M|124MB/s|1M|1M|558MB/s|\n|zlib-4|27.147%|2M|108MB/s|1M|1M|556MB/s|\n|zlib-5|26.442%|2M|84MB/s|1M|1M|570MB/s|\n|zlib-6|26.077%|2M|58MB/s|1M|1M|574MB/s|\n|zlib-7|25.972%|2M|48MB/s|1M|1M|576MB/s|\n|zlib-8|25.879%|2M|32MB/s|1M|1M|589MB/s|\n|zlib-9|25.852%|2M|27MB/s|1M|1M|589MB/s|\n||\n|gzip -1|28.325%|571M|342MB/s|569M|1214M|692MB/s|\n|gzip -2|27.465%|571M|254MB/s|569M|1214M|703MB/s|\n|gzip -3|27.030%|571M|234MB/s|569M|1214M|714MB/s|\n|gzip -4|26.740%|571M|217MB/s|569M|1214M|708MB/s|\n|gzip -5|26.390%|571M|193MB/s|569M|1214M|719MB/s|\n|gzip -6|26.096%|571M|156MB/s|569M|1214M|723MB/s|\n|gzip -7|25.956%|571M|113MB/s|569M|1214M|718MB/s|\n|gzip -8|25.861%|571M|69MB/s|569M|1214M|721MB/s|\n|gzip -9|25.847%|571M|55MB/s|569M|1214M|722MB/s|\n||\n|pgzip -1 -p-1|28.325%|5M|380MB/s|33M|33M|999MB/s|\n|pgzip -2 -p-1|27.466%|5M|274MB/s|33M|33M|1009MB/s|\n|pgzip -3 -p-1|27.030%|5M|252MB/s|33M|33M|1022MB/s|\n|pgzip -4 -p-1|26.740%|5M|233MB/s|33M|33M|1026MB/s|\n|pgzip -5 -p-1|26.390%|5M|201MB/s|33M|33M|1028MB/s|\n|pgzip -6 -p-1|26.096%|5M|161MB/s|33M|33M|1032MB/s|\n|pgzip -7 -p-1|25.956%|5M|115MB/s|33M|33M|1040MB/s|\n|pgzip -8 -p-1|25.861%|5M|71MB/s|33M|33M|1024MB/s|\n|pgzip -9 -p-1|25.846%|5M|56MB/s|33M|33M|1034MB/s|\n|pgzip -1 -p-4|28.326%|26M|1415MB/s|33M|33M|999MB/s|\n|pgzip -2 -p-4|27.466%|28M|1045MB/s|33M|33M|1011MB/s|\n|pgzip -3 -p-4|27.030%|28M|948MB/s|33M|33M|1011MB/s|\n|pgzip -4 -p-4|26.740%|28M|878MB/s|33M|33M|1023MB/s|\n|pgzip -5 -p-4|26.390%|28M|763MB/s|33M|33M|1033MB/s|\n|pgzip -6 -p-4|26.097%|28M|611MB/s|33M|33M|1034MB/s|\n|pgzip -7 -p-4|25.956%|28M|442MB/s|33M|33M|1041MB/s|\n|pgzip -8 -p-4|25.861%|28M|272MB/s|33M|33M|1012MB/s|\n|pgzip -9 -p-4|25.847%|28M|216MB/s|33M|33M|1010MB/s|\n|pgzip -1 -p-16|28.326%|101M|3833MB/s|33M|33M|968MB/s|\n|pgzip -2 -p-16|27.466%|108M|2995MB/s|33M|33M|977MB/s|\n|pgzip -3 
-p-16|27.030%|108M|2859MB/s|33M|33M|984MB/s|\n|pgzip -4 -p-16|26.740%|108M|2646MB/s|33M|33M|978MB/s|\n|pgzip -5 -p-16|26.390%|108M|2344MB/s|33M|33M|999MB/s|\n|pgzip -6 -p-16|26.097%|108M|2005MB/s|33M|33M|1006MB/s|\n|pgzip -7 -p-16|25.956%|108M|1480MB/s|33M|33M|1006MB/s|\n|pgzip -8 -p-16|25.861%|108M|899MB/s|33M|33M|992MB/s|\n|pgzip -9 -p-16|25.847%|108M|696MB/s|33M|33M|993MB/s|","reactions":{"url":"https://api.github.com/repos/ebiggers/libdeflate/issues/335/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/ebiggers/libdeflate/issues/335/timeline","performed_via_github_app":null,"state_reason":null},"comment":{"url":"https://api.github.com/repos/ebiggers/libdeflate/issues/comments/2011907918","html_url":"https://github.com/ebiggers/libdeflate/issues/335#issuecomment-2011907918","issue_url":"https://api.github.com/repos/ebiggers/libdeflate/issues/335","id":2011907918,"node_id":"IC_kwDOAbO0QM5360dO","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_at":"2024-03-21T10:45:59Z","updated_at":"2024-03-21T10:45:59Z","author_association":"NONE","body":"@sis
ong How do I compile your `pgzip` binary? I don't see it exposed in the CMake config","reactions":{"url":"https://api.github.com/repos/ebiggers/libdeflate/issues/comments/2011907918/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-21T10:46:00Z"},{"id":"36732231294","type":"IssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":30698177,"name":"rust-bio/rust-htslib","url":"https://api.github.com/repos/rust-bio/rust-htslib"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/425","repository_url":"https://api.github.com/repos/rust-bio/rust-htslib","labels_url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/425/labels{/name}","comments_url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/425/comments","events_url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/425/events","html_url":"https://github.com/rust-bio/rust-htslib/issues/425","id":2194294851,"node_id":"I_kwDOAdRqwc6CykhD","number":425,"title":"Compiler issue on AArch64 
Linux","user":{"login":"jakobnissen","id":23193950,"node_id":"MDQ6VXNlcjIzMTkzOTUw","avatar_url":"https://avatars.githubusercontent.com/u/23193950?v=4","gravatar_id":"","url":"https://api.github.com/users/jakobnissen","html_url":"https://github.com/jakobnissen","followers_url":"https://api.github.com/users/jakobnissen/followers","following_url":"https://api.github.com/users/jakobnissen/following{/other_user}","gists_url":"https://api.github.com/users/jakobnissen/gists{/gist_id}","starred_url":"https://api.github.com/users/jakobnissen/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/jakobnissen/subscriptions","organizations_url":"https://api.github.com/users/jakobnissen/orgs","repos_url":"https://api.github.com/users/jakobnissen/repos","events_url":"https://api.github.com/users/jakobnissen/events{/privacy}","received_events_url":"https://api.github.com/users/jakobnissen/received_events","type":"User","site_admin":false},"labels":[],"state":"open","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":1,"created_at":"2024-03-19T08:07:09Z","updated_at":"2024-03-20T17:48:20Z","closed_at":null,"author_association":"NONE","active_lock_reason":null,"body":"This error occurs only on Linux and ARMv8. It works fine on M1 MacOS. 
Error:\r\n```\r\n[08:00:56] Compiling rust-htslib v0.44.1\r\n[08:00:57] error[E0308]: mismatched types\r\n[08:00:57] --> /opt/x86_64-linux-musl/registry/src/index.crates.io-6f17d22bba15001f/rust-htslib-0.44.1/src/bam/record.rs:2324:17\r\n[08:00:57] |\r\n[08:00:57] 2319 | let ret = hts_sys::bam_mods_query_type(\r\n[08:00:57] | ---------------------------- arguments to this function are incorrect\r\n[08:00:57] ...\r\n[08:00:57] 2324 | &mut canonical,\r\n[08:00:57] | ^^^^^^^^^^^^^^ expected `*mut u8`, found `&mut i8`\r\n[08:00:57] |\r\n[08:00:57] = note: expected raw pointer `*mut u8`\r\n[08:00:57] found mutable reference `&mut i8`\r\n[08:00:57] note: function defined here\r\n[08:00:57] --> /workspace/srcdir/CoverM/target/aarch64-unknown-linux-musl/release/build/hts-sys-9926615689a35892/out/bindings.rs:9439:12\r\n[08:00:57] |\r\n[08:00:57] 9439 | pub fn bam_mods_query_type(\r\n[08:00:57] | ^^^^^^^^^^^^^^^^^^^\r\n[08:00:57] \r\n[08:00:57] For more information about this error, try `rustc --explain E0308`.\r\n```\r\nThis appears to be the same issue as #352 - however, notice that the error occurs on rust-htslib 0.44.1, where the error ought to be fixed. 
Maybe the fix didn't get in properly?","reactions":{"url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/425/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/425/timeline","performed_via_github_app":null,"state_reason":null},"comment":{"url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/comments/2010238519","html_url":"https://github.com/rust-bio/rust-htslib/issues/425#issuecomment-2010238519","issue_url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/425","id":2010238519,"node_id":"IC_kwDOAdRqwc530c43","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_at":"2024-03-20T17:48:18Z","updated_at":"2024-03-20T17:48:18Z","author_association":"NONE","body":"@jakobnissen I think it was recently fixed in `master`: 
https://github.com/rust-bio/rust-htslib/pull/415","reactions":{"url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/comments/2010238519/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-20T17:48:20Z","org":{"id":13785584,"login":"rust-bio","gravatar_id":"","url":"https://api.github.com/orgs/rust-bio","avatar_url":"https://avatars.githubusercontent.com/u/13785584?"}},{"id":"36720544229","type":"PushEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":329905726,"name":"aertslab/pycisTopic","url":"https://api.github.com/repos/aertslab/pycisTopic"},"payload":{"repository_id":329905726,"push_id":17626788267,"size":1,"distinct_size":1,"ref":"refs/heads/polars","head":"ab8934893435e4117714af23be5e333aaa5d5586","before":"67eabaa75c9c23999b210e672c1e809b88ca80e7","commits":[{"sha":"ab8934893435e4117714af23be5e333aaa5d5586","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Fix deprecation warnings from Polars 0.20.x in fragments.py and genomic_ranges.py.\n\nFix deprecation warnings from Polars 0.20.x in fragments.py 
and\ngenomic_ranges.py.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/ab8934893435e4117714af23be5e333aaa5d5586"}]},"public":true,"created_at":"2024-03-20T12:35:54Z","org":{"id":3940817,"login":"aertslab","gravatar_id":"","url":"https://api.github.com/orgs/aertslab","avatar_url":"https://avatars.githubusercontent.com/u/3940817?"}},{"id":"36692932805","type":"IssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":263727855,"name":"pola-rs/polars","url":"https://api.github.com/repos/pola-rs/polars"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/pola-rs/polars/issues/4950","repository_url":"https://api.github.com/repos/pola-rs/polars","labels_url":"https://api.github.com/repos/pola-rs/polars/issues/4950/labels{/name}","comments_url":"https://api.github.com/repos/pola-rs/polars/issues/4950/comments","events_url":"https://api.github.com/repos/pola-rs/polars/issues/4950/events","html_url":"https://github.com/pola-rs/polars/issues/4950","id":1383732732,"node_id":"I_kwDOD7gq785SehX8","number":4950,"title":"Add BytesIO support to 
`scan_csv`","user":{"login":"nebfield","id":11425618,"node_id":"MDQ6VXNlcjExNDI1NjE4","avatar_url":"https://avatars.githubusercontent.com/u/11425618?v=4","gravatar_id":"","url":"https://api.github.com/users/nebfield","html_url":"https://github.com/nebfield","followers_url":"https://api.github.com/users/nebfield/followers","following_url":"https://api.github.com/users/nebfield/following{/other_user}","gists_url":"https://api.github.com/users/nebfield/gists{/gist_id}","starred_url":"https://api.github.com/users/nebfield/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/nebfield/subscriptions","organizations_url":"https://api.github.com/users/nebfield/orgs","repos_url":"https://api.github.com/users/nebfield/repos","events_url":"https://api.github.com/users/nebfield/events{/privacy}","received_events_url":"https://api.github.com/users/nebfield/received_events","type":"User","site_admin":false},"labels":[{"id":3433894133,"node_id":"LA_kwDOD7gq787MrRD1","url":"https://api.github.com/repos/pola-rs/polars/labels/enhancement","name":"enhancement","color":"FEF2C0","default":true,"description":"New feature or an improvement of an existing feature"},{"id":6458990294,"node_id":"LA_kwDOD7gq788AAAABgPxe1g","url":"https://api.github.com/repos/pola-rs/polars/labels/A-io-csv","name":"A-io-csv","color":"D4C5F9","default":false,"description":"Area: reading/writing CSV files"}],"state":"open","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":13,"created_at":"2022-09-23T12:30:09Z","updated_at":"2024-03-19T17:00:39Z","closed_at":null,"author_association":"NONE","active_lock_reason":null,"body":"### Problem Description\r\n\r\nFirstly, thank you for making this fantastic library 😀 \r\n\r\nI have the following use case:\r\n\r\n```\r\nimport zstandard\r\nimport polars as pl\r\n\r\nwith open(path, 'rb') as f:\r\n dctx = zstandard.ZstdDecompressor()\r\n with dctx.stream_reader(f) as reader:\r\n df = pl.read_csv(reader, 
sep='\\t')\r\n```\r\n\r\nWhere `path` is the path of a Zstandard compressed TSV file. I'm working with bioinformatics data, and bioinformaticians love to generate massive CSVs/TSVs and then compress them.\r\n\r\nI would like to use `scan_csv` to read the decompressed BytesIO stream instead and take advantage of all the cool lazy evaluation features to reduce memory usage. Alternatively, it would be great if `scan_csv` supported Zstandard compressed file paths directly. \r\n\r\nThanks for your time!","reactions":{"url":"https://api.github.com/repos/pola-rs/polars/issues/4950/reactions","total_count":8,"+1":8,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/pola-rs/polars/issues/4950/timeline","performed_via_github_app":null,"state_reason":null},"comment":{"url":"https://api.github.com/repos/pola-rs/polars/issues/comments/2007692412","html_url":"https://github.com/pola-rs/polars/issues/4950#issuecomment-2007692412","issue_url":"https://api.github.com/repos/pola-rs/polars/issues/4950","id":2007692412,"node_id":"IC_kwDOD7gq7853qvR8","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_a
t":"2024-03-19T17:00:37Z","updated_at":"2024-03-19T17:00:37Z","author_association":"COLLABORATOR","body":"> > You can use `parquet-fromcsv` in the meantime to convert compressed CSV/TSV files to parquet and use `pl.scan_parquet` on them: [#9283 (comment)](https://github.com/pola-rs/polars/issues/9283#issuecomment-1594512827)\r\n> \r\n> I think I'm understanding correctly that this is a recommendation to use a rust library. Any advice for the less-cool among us who are still working pretty exclusively from a python environment?\r\n\r\nIt is a command line tool, but part of the rust `arrow` library.","reactions":{"url":"https://api.github.com/repos/pola-rs/polars/issues/comments/2007692412/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-19T17:00:39Z","org":{"id":83768144,"login":"pola-rs","gravatar_id":"","url":"https://api.github.com/orgs/pola-rs","avatar_url":"https://avatars.githubusercontent.com/u/83768144?"}},{"id":"36650995428","type":"IssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":182300329,"name":"jackh726/bigtools","url":"https://api.github.com/repos/jackh726/bigtools"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/jackh726/bigtools/issues/28","repository_url":"https://api.github.com/repos/jackh726/bigtools","labels_url":"https://api.github.com/repos/jackh726/bigtools/issues/28/labels{/name}","comments_url":"https://api.github.com/repos/jackh726/bigtools/issues/28/comments","events_url":"https://api.github.com/repos/jackh726/bigtools/issues/28/events","html_url":"https://github.com/jackh726/bigtools/issues/28","id":2186829123,"node_id":"I_kwDOCt2uqc6CWF1D","number":28,"title":"minor pybigtools 
complaints","user":{"login":"sergpolly","id":6790270,"node_id":"MDQ6VXNlcjY3OTAyNzA=","avatar_url":"https://avatars.githubusercontent.com/u/6790270?v=4","gravatar_id":"","url":"https://api.github.com/users/sergpolly","html_url":"https://github.com/sergpolly","followers_url":"https://api.github.com/users/sergpolly/followers","following_url":"https://api.github.com/users/sergpolly/following{/other_user}","gists_url":"https://api.github.com/users/sergpolly/gists{/gist_id}","starred_url":"https://api.github.com/users/sergpolly/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/sergpolly/subscriptions","organizations_url":"https://api.github.com/users/sergpolly/orgs","repos_url":"https://api.github.com/users/sergpolly/repos","events_url":"https://api.github.com/users/sergpolly/events{/privacy}","received_events_url":"https://api.github.com/users/sergpolly/received_events","type":"User","site_admin":false},"labels":[],"state":"open","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":1,"created_at":"2024-03-14T16:49:51Z","updated_at":"2024-03-18T15:43:51Z","closed_at":null,"author_association":"NONE","active_lock_reason":null,"body":"pybigtools can feels a bit alien as a python module\r\nwould be nice to have access to things like `__version__`, `__path__` etc etc - e.g. 
`import pybigtools as pybig; print(pybig.__version__)`","reactions":{"url":"https://api.github.com/repos/jackh726/bigtools/issues/28/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/jackh726/bigtools/issues/28/timeline","performed_via_github_app":null,"state_reason":null},"comment":{"url":"https://api.github.com/repos/jackh726/bigtools/issues/comments/2004269644","html_url":"https://github.com/jackh726/bigtools/issues/28#issuecomment-2004269644","issue_url":"https://api.github.com/repos/jackh726/bigtools/issues/28","id":2004269644,"node_id":"IC_kwDOCt2uqc53drpM","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_at":"2024-03-18T15:43:49Z","updated_at":"2024-03-18T15:43:49Z","author_association":"CONTRIBUTOR","body":"`__path__` is already there:\r\n\r\n`__version__` will be there if: https://github.com/jackh726/bigtools/pull/32 gets merged.\r\n\r\n```python\r\nIn [1]: import pybigtools as pybig\r\n\r\nIn [2]: print(pybig.__version__)\r\n0.1.4-dev\r\n\r\nIn [3]: 
print(pybig.__path__)\r\n['/anaconda3/envs/pybigtools/lib/python3.10/site-packages/pybigtools']\r\n```","reactions":{"url":"https://api.github.com/repos/jackh726/bigtools/issues/comments/2004269644/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-18T15:43:51Z"},{"id":"36650882954","type":"PullRequestEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":182300329,"name":"jackh726/bigtools","url":"https://api.github.com/repos/jackh726/bigtools"},"payload":{"action":"opened","number":32,"pull_request":{"url":"https://api.github.com/repos/jackh726/bigtools/pulls/32","id":1777722248,"node_id":"PR_kwDOCt2uqc5p9eOI","html_url":"https://github.com/jackh726/bigtools/pull/32","diff_url":"https://github.com/jackh726/bigtools/pull/32.diff","patch_url":"https://github.com/jackh726/bigtools/pull/32.patch","issue_url":"https://api.github.com/repos/jackh726/bigtools/issues/32","number":32,"state":"open","locked":false,"title":"Add \"__version__\" package 
attribute","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"body":null,"created_at":"2024-03-18T15:41:00Z","updated_at":"2024-03-18T15:41:00Z","closed_at":null,"merged_at":null,"merge_commit_sha":null,"assignee":null,"assignees":[],"requested_reviewers":[],"requested_teams":[],"labels":[],"milestone":null,"draft":false,"commits_url":"https://api.github.com/repos/jackh726/bigtools/pulls/32/commits","review_comments_url":"https://api.github.com/repos/jackh726/bigtools/pulls/32/comments","review_comment_url":"https://api.github.com/repos/jackh726/bigtools/pulls/comments{/number}","comments_url":"https://api.github.com/repos/jackh726/bigtools/issues/32/comments","statuses_url":"https://api.github.com/repos/jackh726/bigtools/statuses/195cc05b7e7de3bf4f57422344e8764530c4be1e","head":{"label":"ghuls:add_version_package_attribute","ref":"add_version_package_attribute","sha":"195cc05b7e7de3bf4f57422344e8764530c4be1e","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url
":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"repo":{"id":706606164,"node_id":"R_kgDOKh30VA","name":"bigtools","full_name":"ghuls/bigtools","private":false,"owner":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"html_url":"https://github.com/ghuls/bigtools","description":null,"fork":true,"url":"https://api.github.com/repos/ghuls/bigtools","forks_url":"https://api.github.com/repos/ghuls/bigtools/forks","keys_url":"https://api.github.com/repos/ghuls/bigtools/keys{/key_id}","collaborators_url":"https://api.github.com/repos/ghuls/bigtools/collaborators{/collaborator}","teams_url":"https://ap
i.github.com/repos/ghuls/bigtools/teams","hooks_url":"https://api.github.com/repos/ghuls/bigtools/hooks","issue_events_url":"https://api.github.com/repos/ghuls/bigtools/issues/events{/number}","events_url":"https://api.github.com/repos/ghuls/bigtools/events","assignees_url":"https://api.github.com/repos/ghuls/bigtools/assignees{/user}","branches_url":"https://api.github.com/repos/ghuls/bigtools/branches{/branch}","tags_url":"https://api.github.com/repos/ghuls/bigtools/tags","blobs_url":"https://api.github.com/repos/ghuls/bigtools/git/blobs{/sha}","git_tags_url":"https://api.github.com/repos/ghuls/bigtools/git/tags{/sha}","git_refs_url":"https://api.github.com/repos/ghuls/bigtools/git/refs{/sha}","trees_url":"https://api.github.com/repos/ghuls/bigtools/git/trees{/sha}","statuses_url":"https://api.github.com/repos/ghuls/bigtools/statuses/{sha}","languages_url":"https://api.github.com/repos/ghuls/bigtools/languages","stargazers_url":"https://api.github.com/repos/ghuls/bigtools/stargazers","contributors_url":"https://api.github.com/repos/ghuls/bigtools/contributors","subscribers_url":"https://api.github.com/repos/ghuls/bigtools/subscribers","subscription_url":"https://api.github.com/repos/ghuls/bigtools/subscription","commits_url":"https://api.github.com/repos/ghuls/bigtools/commits{/sha}","git_commits_url":"https://api.github.com/repos/ghuls/bigtools/git/commits{/sha}","comments_url":"https://api.github.com/repos/ghuls/bigtools/comments{/number}","issue_comment_url":"https://api.github.com/repos/ghuls/bigtools/issues/comments{/number}","contents_url":"https://api.github.com/repos/ghuls/bigtools/contents/{+path}","compare_url":"https://api.github.com/repos/ghuls/bigtools/compare/{base}...{head}","merges_url":"https://api.github.com/repos/ghuls/bigtools/merges","archive_url":"https://api.github.com/repos/ghuls/bigtools/{archive_format}{/ref}","downloads_url":"https://api.github.com/repos/ghuls/bigtools/downloads","issues_url":"https://api.github.com/repos/ghuls/bigtools/
issues{/number}","pulls_url":"https://api.github.com/repos/ghuls/bigtools/pulls{/number}","milestones_url":"https://api.github.com/repos/ghuls/bigtools/milestones{/number}","notifications_url":"https://api.github.com/repos/ghuls/bigtools/notifications{?since,all,participating}","labels_url":"https://api.github.com/repos/ghuls/bigtools/labels{/name}","releases_url":"https://api.github.com/repos/ghuls/bigtools/releases{/id}","deployments_url":"https://api.github.com/repos/ghuls/bigtools/deployments","created_at":"2023-10-18T09:23:53Z","updated_at":"2023-10-18T09:23:53Z","pushed_at":"2024-03-18T15:40:51Z","git_url":"git://github.com/ghuls/bigtools.git","ssh_url":"git@github.com:ghuls/bigtools.git","clone_url":"https://github.com/ghuls/bigtools.git","svn_url":"https://github.com/ghuls/bigtools","homepage":null,"size":4474,"stargazers_count":0,"watchers_count":0,"language":null,"has_issues":false,"has_projects":true,"has_downloads":true,"has_wiki":true,"has_pages":false,"has_discussions":false,"forks_count":0,"mirror_url":null,"archived":false,"disabled":false,"open_issues_count":0,"license":{"key":"mit","name":"MIT 
License","spdx_id":"MIT","url":"https://api.github.com/licenses/mit","node_id":"MDc6TGljZW5zZTEz"},"allow_forking":true,"is_template":false,"web_commit_signoff_required":false,"topics":[],"visibility":"public","forks":0,"open_issues":0,"watchers":0,"default_branch":"master"}},"base":{"label":"jackh726:master","ref":"master","sha":"0a5107f5bac88d78187814106c03a3ff03326ebf","user":{"login":"jackh726","id":31162821,"node_id":"MDQ6VXNlcjMxMTYyODIx","avatar_url":"https://avatars.githubusercontent.com/u/31162821?v=4","gravatar_id":"","url":"https://api.github.com/users/jackh726","html_url":"https://github.com/jackh726","followers_url":"https://api.github.com/users/jackh726/followers","following_url":"https://api.github.com/users/jackh726/following{/other_user}","gists_url":"https://api.github.com/users/jackh726/gists{/gist_id}","starred_url":"https://api.github.com/users/jackh726/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/jackh726/subscriptions","organizations_url":"https://api.github.com/users/jackh726/orgs","repos_url":"https://api.github.com/users/jackh726/repos","events_url":"https://api.github.com/users/jackh726/events{/privacy}","received_events_url":"https://api.github.com/users/jackh726/received_events","type":"User","site_admin":false},"repo":{"id":182300329,"node_id":"MDEwOlJlcG9zaXRvcnkxODIzMDAzMjk=","name":"bigtools","full_name":"jackh726/bigtools","private":false,"owner":{"login":"jackh726","id":31162821,"node_id":"MDQ6VXNlcjMxMTYyODIx","avatar_url":"https://avatars.githubusercontent.com/u/31162821?v=4","gravatar_id":"","url":"https://api.github.com/users/jackh726","html_url":"https://github.com/jackh726","followers_url":"https://api.github.com/users/jackh726/followers","following_url":"https://api.github.com/users/jackh726/following{/other_user}","gists_url":"https://api.github.com/users/jackh726/gists{/gist_id}","starred_url":"https://api.github.com/users/jackh726/starred{/owner}{/repo}","subscriptions_url":"https://api.github
.com/users/jackh726/subscriptions","organizations_url":"https://api.github.com/users/jackh726/orgs","repos_url":"https://api.github.com/users/jackh726/repos","events_url":"https://api.github.com/users/jackh726/events{/privacy}","received_events_url":"https://api.github.com/users/jackh726/received_events","type":"User","site_admin":false},"html_url":"https://github.com/jackh726/bigtools","description":"A high-performance BigWig and BigBed library in Rust","fork":false,"url":"https://api.github.com/repos/jackh726/bigtools","forks_url":"https://api.github.com/repos/jackh726/bigtools/forks","keys_url":"https://api.github.com/repos/jackh726/bigtools/keys{/key_id}","collaborators_url":"https://api.github.com/repos/jackh726/bigtools/collaborators{/collaborator}","teams_url":"https://api.github.com/repos/jackh726/bigtools/teams","hooks_url":"https://api.github.com/repos/jackh726/bigtools/hooks","issue_events_url":"https://api.github.com/repos/jackh726/bigtools/issues/events{/number}","events_url":"https://api.github.com/repos/jackh726/bigtools/events","assignees_url":"https://api.github.com/repos/jackh726/bigtools/assignees{/user}","branches_url":"https://api.github.com/repos/jackh726/bigtools/branches{/branch}","tags_url":"https://api.github.com/repos/jackh726/bigtools/tags","blobs_url":"https://api.github.com/repos/jackh726/bigtools/git/blobs{/sha}","git_tags_url":"https://api.github.com/repos/jackh726/bigtools/git/tags{/sha}","git_refs_url":"https://api.github.com/repos/jackh726/bigtools/git/refs{/sha}","trees_url":"https://api.github.com/repos/jackh726/bigtools/git/trees{/sha}","statuses_url":"https://api.github.com/repos/jackh726/bigtools/statuses/{sha}","languages_url":"https://api.github.com/repos/jackh726/bigtools/languages","stargazers_url":"https://api.github.com/repos/jackh726/bigtools/stargazers","contributors_url":"https://api.github.com/repos/jackh726/bigtools/contributors","subscribers_url":"https://api.github.com/repos/jackh726/bigtools/subscribers","subscri
ption_url":"https://api.github.com/repos/jackh726/bigtools/subscription","commits_url":"https://api.github.com/repos/jackh726/bigtools/commits{/sha}","git_commits_url":"https://api.github.com/repos/jackh726/bigtools/git/commits{/sha}","comments_url":"https://api.github.com/repos/jackh726/bigtools/comments{/number}","issue_comment_url":"https://api.github.com/repos/jackh726/bigtools/issues/comments{/number}","contents_url":"https://api.github.com/repos/jackh726/bigtools/contents/{+path}","compare_url":"https://api.github.com/repos/jackh726/bigtools/compare/{base}...{head}","merges_url":"https://api.github.com/repos/jackh726/bigtools/merges","archive_url":"https://api.github.com/repos/jackh726/bigtools/{archive_format}{/ref}","downloads_url":"https://api.github.com/repos/jackh726/bigtools/downloads","issues_url":"https://api.github.com/repos/jackh726/bigtools/issues{/number}","pulls_url":"https://api.github.com/repos/jackh726/bigtools/pulls{/number}","milestones_url":"https://api.github.com/repos/jackh726/bigtools/milestones{/number}","notifications_url":"https://api.github.com/repos/jackh726/bigtools/notifications{?since,all,participating}","labels_url":"https://api.github.com/repos/jackh726/bigtools/labels{/name}","releases_url":"https://api.github.com/repos/jackh726/bigtools/releases{/id}","deployments_url":"https://api.github.com/repos/jackh726/bigtools/deployments","created_at":"2019-04-19T17:16:52Z","updated_at":"2024-02-20T12:55:39Z","pushed_at":"2024-03-18T15:41:01Z","git_url":"git://github.com/jackh726/bigtools.git","ssh_url":"git@github.com:jackh726/bigtools.git","clone_url":"https://github.com/jackh726/bigtools.git","svn_url":"https://github.com/jackh726/bigtools","homepage":"","size":4605,"stargazers_count":26,"watchers_count":26,"language":"Rust","has_issues":true,"has_projects":true,"has_downloads":true,"has_wiki":true,"has_pages":false,"has_discussions":true,"forks_count":4,"mirror_url":null,"archived":false,"disabled":false,"open_issues_count":2,"licen
se":{"key":"mit","name":"MIT License","spdx_id":"MIT","url":"https://api.github.com/licenses/mit","node_id":"MDc6TGljZW5zZTEz"},"allow_forking":true,"is_template":false,"web_commit_signoff_required":false,"topics":["bigbed","bigwig","rust"],"visibility":"public","forks":4,"open_issues":2,"watchers":26,"default_branch":"master"}},"_links":{"self":{"href":"https://api.github.com/repos/jackh726/bigtools/pulls/32"},"html":{"href":"https://github.com/jackh726/bigtools/pull/32"},"issue":{"href":"https://api.github.com/repos/jackh726/bigtools/issues/32"},"comments":{"href":"https://api.github.com/repos/jackh726/bigtools/issues/32/comments"},"review_comments":{"href":"https://api.github.com/repos/jackh726/bigtools/pulls/32/comments"},"review_comment":{"href":"https://api.github.com/repos/jackh726/bigtools/pulls/comments{/number}"},"commits":{"href":"https://api.github.com/repos/jackh726/bigtools/pulls/32/commits"},"statuses":{"href":"https://api.github.com/repos/jackh726/bigtools/statuses/195cc05b7e7de3bf4f57422344e8764530c4be1e"}},"author_association":"CONTRIBUTOR","auto_merge":null,"active_lock_reason":null,"merged":false,"mergeable":null,"rebaseable":null,"mergeable_state":"unknown","merged_by":null,"comments":0,"review_comments":0,"maintainer_can_modify":true,"commits":1,"additions":1,"deletions":0,"changed_files":1}},"public":true,"created_at":"2024-03-18T15:41:02Z"},{"id":"36650875174","type":"CreateEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":706606164,"name":"ghuls/bigtools","url":"https://api.github.com/repos/ghuls/bigtools"},"payload":{"ref":"add_version_package_attribute","ref_type":"branch","master_branch":"master","description":null,"pusher_type":"user"},"public":true,"created_at":"2024-03-18T15:40:52Z"},{"id":"36650084514","type":"PullRequestEvent","actor":{"id":1299177,"login":"ghuls","display_login"
:"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":641466193,"name":"abdenlab/oxbow","url":"https://api.github.com/repos/abdenlab/oxbow"},"payload":{"action":"opened","number":63,"pull_request":{"url":"https://api.github.com/repos/abdenlab/oxbow/pulls/63","id":1777671283,"node_id":"PR_kwDOJjv_Uc5p9Rxz","html_url":"https://github.com/abdenlab/oxbow/pull/63","diff_url":"https://github.com/abdenlab/oxbow/pull/63.diff","patch_url":"https://github.com/abdenlab/oxbow/pull/63.patch","issue_url":"https://api.github.com/repos/abdenlab/oxbow/issues/63","number":63,"state":"open","locked":false,"title":"Update dependencies","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"body":"- Update bigwig to major new version.\r\n- Update arrow to newest version.\r\n- Update noodles to 0.59.0. 
Newer versions require more substantial changes.","created_at":"2024-03-18T15:20:30Z","updated_at":"2024-03-18T15:20:30Z","closed_at":null,"merged_at":null,"merge_commit_sha":null,"assignee":null,"assignees":[],"requested_reviewers":[],"requested_teams":[],"labels":[],"milestone":null,"draft":false,"commits_url":"https://api.github.com/repos/abdenlab/oxbow/pulls/63/commits","review_comments_url":"https://api.github.com/repos/abdenlab/oxbow/pulls/63/comments","review_comment_url":"https://api.github.com/repos/abdenlab/oxbow/pulls/comments{/number}","comments_url":"https://api.github.com/repos/abdenlab/oxbow/issues/63/comments","statuses_url":"https://api.github.com/repos/abdenlab/oxbow/statuses/5d57f1ff8dd38daabec99ed1da8e778c69fdc0aa","head":{"label":"ghuls:update_dependencies","ref":"update_dependencies","sha":"5d57f1ff8dd38daabec99ed1da8e778c69fdc0aa","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"repo":{"id":773788768,"node_id":"R_kgDOLh8UYA","name":"oxbow","full_name":"ghuls/oxbow","private":false,"owner":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":
"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"html_url":"https://github.com/ghuls/oxbow","description":"Read specialized NGS formats as data frames in R, Python, and more.","fork":true,"url":"https://api.github.com/repos/ghuls/oxbow","forks_url":"https://api.github.com/repos/ghuls/oxbow/forks","keys_url":"https://api.github.com/repos/ghuls/oxbow/keys{/key_id}","collaborators_url":"https://api.github.com/repos/ghuls/oxbow/collaborators{/collaborator}","teams_url":"https://api.github.com/repos/ghuls/oxbow/teams","hooks_url":"https://api.github.com/repos/ghuls/oxbow/hooks","issue_events_url":"https://api.github.com/repos/ghuls/oxbow/issues/events{/number}","events_url":"https://api.github.com/repos/ghuls/oxbow/events","assignees_url":"https://api.github.com/repos/ghuls/oxbow/assignees{/user}","branches_url":"https://api.github.com/repos/ghuls/oxbow/branches{/branch}","tags_url":"https://api.github.com/repos/ghuls/oxbow/tags","blobs_url":"https://api.github.com/repos/ghuls/oxbow/git/blobs{/sha}","git_tags_url":"https://api.github.com/repos/ghuls/oxbow/git/tags{/sha}","git_refs_url":"https://api.github.com/repos/ghuls/oxbow/git/refs{/sha}","trees_url":"https://api.github.com/repos/ghuls/oxbow/git/trees{/sha}","statuses_url":"https://api.github.com/repos/ghuls/oxbow/statuses/{sha}","languages_url":"https
://api.github.com/repos/ghuls/oxbow/languages","stargazers_url":"https://api.github.com/repos/ghuls/oxbow/stargazers","contributors_url":"https://api.github.com/repos/ghuls/oxbow/contributors","subscribers_url":"https://api.github.com/repos/ghuls/oxbow/subscribers","subscription_url":"https://api.github.com/repos/ghuls/oxbow/subscription","commits_url":"https://api.github.com/repos/ghuls/oxbow/commits{/sha}","git_commits_url":"https://api.github.com/repos/ghuls/oxbow/git/commits{/sha}","comments_url":"https://api.github.com/repos/ghuls/oxbow/comments{/number}","issue_comment_url":"https://api.github.com/repos/ghuls/oxbow/issues/comments{/number}","contents_url":"https://api.github.com/repos/ghuls/oxbow/contents/{+path}","compare_url":"https://api.github.com/repos/ghuls/oxbow/compare/{base}...{head}","merges_url":"https://api.github.com/repos/ghuls/oxbow/merges","archive_url":"https://api.github.com/repos/ghuls/oxbow/{archive_format}{/ref}","downloads_url":"https://api.github.com/repos/ghuls/oxbow/downloads","issues_url":"https://api.github.com/repos/ghuls/oxbow/issues{/number}","pulls_url":"https://api.github.com/repos/ghuls/oxbow/pulls{/number}","milestones_url":"https://api.github.com/repos/ghuls/oxbow/milestones{/number}","notifications_url":"https://api.github.com/repos/ghuls/oxbow/notifications{?since,all,participating}","labels_url":"https://api.github.com/repos/ghuls/oxbow/labels{/name}","releases_url":"https://api.github.com/repos/ghuls/oxbow/releases{/id}","deployments_url":"https://api.github.com/repos/ghuls/oxbow/deployments","created_at":"2024-03-18T12:01:17Z","updated_at":"2024-03-18T12:01:17Z","pushed_at":"2024-03-18T15:15:44Z","git_url":"git://github.com/ghuls/oxbow.git","ssh_url":"git@github.com:ghuls/oxbow.git","clone_url":"https://github.com/ghuls/oxbow.git","svn_url":"https://github.com/ghuls/oxbow","homepage":"https://lifeinbytes.substack.com/p/breaking-out-of-bioinformatic-data-silos","size":8486,"stargazers_count":0,"watchers_count":0,"language
":null,"has_issues":false,"has_projects":true,"has_downloads":true,"has_wiki":true,"has_pages":false,"has_discussions":false,"forks_count":0,"mirror_url":null,"archived":false,"disabled":false,"open_issues_count":0,"license":{"key":"apache-2.0","name":"Apache License 2.0","spdx_id":"Apache-2.0","url":"https://api.github.com/licenses/apache-2.0","node_id":"MDc6TGljZW5zZTI="},"allow_forking":true,"is_template":false,"web_commit_signoff_required":false,"topics":[],"visibility":"public","forks":0,"open_issues":0,"watchers":0,"default_branch":"main"}},"base":{"label":"abdenlab:main","ref":"main","sha":"adc761b818d5e4e9195a2e1d1e2c0117fc5fbedb","user":{"login":"abdenlab","id":107208028,"node_id":"O_kgDOBmPdXA","avatar_url":"https://avatars.githubusercontent.com/u/107208028?v=4","gravatar_id":"","url":"https://api.github.com/users/abdenlab","html_url":"https://github.com/abdenlab","followers_url":"https://api.github.com/users/abdenlab/followers","following_url":"https://api.github.com/users/abdenlab/following{/other_user}","gists_url":"https://api.github.com/users/abdenlab/gists{/gist_id}","starred_url":"https://api.github.com/users/abdenlab/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/abdenlab/subscriptions","organizations_url":"https://api.github.com/users/abdenlab/orgs","repos_url":"https://api.github.com/users/abdenlab/repos","events_url":"https://api.github.com/users/abdenlab/events{/privacy}","received_events_url":"https://api.github.com/users/abdenlab/received_events","type":"Organization","site_admin":false},"repo":{"id":641466193,"node_id":"R_kgDOJjv_UQ","name":"oxbow","full_name":"abdenlab/oxbow","private":false,"owner":{"login":"abdenlab","id":107208028,"node_id":"O_kgDOBmPdXA","avatar_url":"https://avatars.githubusercontent.com/u/107208028?v=4","gravatar_id":"","url":"https://api.github.com/users/abdenlab","html_url":"https://github.com/abdenlab","followers_url":"https://api.github.com/users/abdenlab/followers","following_url":"http
s://api.github.com/users/abdenlab/following{/other_user}","gists_url":"https://api.github.com/users/abdenlab/gists{/gist_id}","starred_url":"https://api.github.com/users/abdenlab/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/abdenlab/subscriptions","organizations_url":"https://api.github.com/users/abdenlab/orgs","repos_url":"https://api.github.com/users/abdenlab/repos","events_url":"https://api.github.com/users/abdenlab/events{/privacy}","received_events_url":"https://api.github.com/users/abdenlab/received_events","type":"Organization","site_admin":false},"html_url":"https://github.com/abdenlab/oxbow","description":"Read specialized NGS formats as data frames in R, Python, and more.","fork":false,"url":"https://api.github.com/repos/abdenlab/oxbow","forks_url":"https://api.github.com/repos/abdenlab/oxbow/forks","keys_url":"https://api.github.com/repos/abdenlab/oxbow/keys{/key_id}","collaborators_url":"https://api.github.com/repos/abdenlab/oxbow/collaborators{/collaborator}","teams_url":"https://api.github.com/repos/abdenlab/oxbow/teams","hooks_url":"https://api.github.com/repos/abdenlab/oxbow/hooks","issue_events_url":"https://api.github.com/repos/abdenlab/oxbow/issues/events{/number}","events_url":"https://api.github.com/repos/abdenlab/oxbow/events","assignees_url":"https://api.github.com/repos/abdenlab/oxbow/assignees{/user}","branches_url":"https://api.github.com/repos/abdenlab/oxbow/branches{/branch}","tags_url":"https://api.github.com/repos/abdenlab/oxbow/tags","blobs_url":"https://api.github.com/repos/abdenlab/oxbow/git/blobs{/sha}","git_tags_url":"https://api.github.com/repos/abdenlab/oxbow/git/tags{/sha}","git_refs_url":"https://api.github.com/repos/abdenlab/oxbow/git/refs{/sha}","trees_url":"https://api.github.com/repos/abdenlab/oxbow/git/trees{/sha}","statuses_url":"https://api.github.com/repos/abdenlab/oxbow/statuses/{sha}","languages_url":"https://api.github.com/repos/abdenlab/oxbow/languages","stargazers_url":"https://api.gith
ub.com/repos/abdenlab/oxbow/stargazers","contributors_url":"https://api.github.com/repos/abdenlab/oxbow/contributors","subscribers_url":"https://api.github.com/repos/abdenlab/oxbow/subscribers","subscription_url":"https://api.github.com/repos/abdenlab/oxbow/subscription","commits_url":"https://api.github.com/repos/abdenlab/oxbow/commits{/sha}","git_commits_url":"https://api.github.com/repos/abdenlab/oxbow/git/commits{/sha}","comments_url":"https://api.github.com/repos/abdenlab/oxbow/comments{/number}","issue_comment_url":"https://api.github.com/repos/abdenlab/oxbow/issues/comments{/number}","contents_url":"https://api.github.com/repos/abdenlab/oxbow/contents/{+path}","compare_url":"https://api.github.com/repos/abdenlab/oxbow/compare/{base}...{head}","merges_url":"https://api.github.com/repos/abdenlab/oxbow/merges","archive_url":"https://api.github.com/repos/abdenlab/oxbow/{archive_format}{/ref}","downloads_url":"https://api.github.com/repos/abdenlab/oxbow/downloads","issues_url":"https://api.github.com/repos/abdenlab/oxbow/issues{/number}","pulls_url":"https://api.github.com/repos/abdenlab/oxbow/pulls{/number}","milestones_url":"https://api.github.com/repos/abdenlab/oxbow/milestones{/number}","notifications_url":"https://api.github.com/repos/abdenlab/oxbow/notifications{?since,all,participating}","labels_url":"https://api.github.com/repos/abdenlab/oxbow/labels{/name}","releases_url":"https://api.github.com/repos/abdenlab/oxbow/releases{/id}","deployments_url":"https://api.github.com/repos/abdenlab/oxbow/deployments","created_at":"2023-05-16T14:24:02Z","updated_at":"2024-03-13T05:39:49Z","pushed_at":"2024-03-18T15:20:31Z","git_url":"git://github.com/abdenlab/oxbow.git","ssh_url":"git@github.com:abdenlab/oxbow.git","clone_url":"https://github.com/abdenlab/oxbow.git","svn_url":"https://github.com/abdenlab/oxbow","homepage":"https://lifeinbytes.substack.com/p/breaking-out-of-bioinformatic-data-silos","size":8637,"stargazers_count":38,"watchers_count":38,"language":"Rust
","has_issues":true,"has_projects":true,"has_downloads":true,"has_wiki":true,"has_pages":true,"has_discussions":false,"forks_count":6,"mirror_url":null,"archived":false,"disabled":false,"open_issues_count":11,"license":{"key":"apache-2.0","name":"Apache License 2.0","spdx_id":"Apache-2.0","url":"https://api.github.com/licenses/apache-2.0","node_id":"MDc6TGljZW5zZTI="},"allow_forking":true,"is_template":false,"web_commit_signoff_required":false,"topics":["apache-arrow","bioinformatics","data-science","dataframe","fair-data","genomics","multiomics","ngs","pandas","polars","python","r","rust-lang"],"visibility":"public","forks":6,"open_issues":11,"watchers":38,"default_branch":"main"}},"_links":{"self":{"href":"https://api.github.com/repos/abdenlab/oxbow/pulls/63"},"html":{"href":"https://github.com/abdenlab/oxbow/pull/63"},"issue":{"href":"https://api.github.com/repos/abdenlab/oxbow/issues/63"},"comments":{"href":"https://api.github.com/repos/abdenlab/oxbow/issues/63/comments"},"review_comments":{"href":"https://api.github.com/repos/abdenlab/oxbow/pulls/63/comments"},"review_comment":{"href":"https://api.github.com/repos/abdenlab/oxbow/pulls/comments{/number}"},"commits":{"href":"https://api.github.com/repos/abdenlab/oxbow/pulls/63/commits"},"statuses":{"href":"https://api.github.com/repos/abdenlab/oxbow/statuses/5d57f1ff8dd38daabec99ed1da8e778c69fdc0aa"}},"author_association":"NONE","auto_merge":null,"active_lock_reason":null,"merged":false,"mergeable":null,"rebaseable":null,"mergeable_state":"unknown","merged_by":null,"comments":0,"review_comments":0,"maintainer_can_modify":true,"commits":3,"additions":508,"deletions":219,"changed_files":7}},"public":true,"created_at":"2024-03-18T15:20:32Z","org":{"id":107208028,"login":"abdenlab","gravatar_id":"","url":"https://api.github.com/orgs/abdenlab","avatar_url":"https://avatars.githubusercontent.com/u/107208028?"}},{"id":"36649944687","type":"IssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls",
"gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":124243757,"name":"aertslab/pySCENIC","url":"https://api.github.com/repos/aertslab/pySCENIC"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/aertslab/pySCENIC/issues/532","repository_url":"https://api.github.com/repos/aertslab/pySCENIC","labels_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/532/labels{/name}","comments_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/532/comments","events_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/532/events","html_url":"https://github.com/aertslab/pySCENIC/issues/532","id":2186498657,"node_id":"I_kwDOB2fPLc6CU1Jh","number":532,"title":"/usr/bin/sh: 1: pyscenic: not found [BUG]","user":{"login":"RolantusdataExp","id":75480608,"node_id":"MDQ6VXNlcjc1NDgwNjA4","avatar_url":"https://avatars.githubusercontent.com/u/75480608?v=4","gravatar_id":"","url":"https://api.github.com/users/RolantusdataExp","html_url":"https://github.com/RolantusdataExp","followers_url":"https://api.github.com/users/RolantusdataExp/followers","following_url":"https://api.github.com/users/RolantusdataExp/following{/other_user}","gists_url":"https://api.github.com/users/RolantusdataExp/gists{/gist_id}","starred_url":"https://api.github.com/users/RolantusdataExp/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/RolantusdataExp/subscriptions","organizations_url":"https://api.github.com/users/RolantusdataExp/orgs","repos_url":"https://api.github.com/users/RolantusdataExp/repos","events_url":"https://api.github.com/users/RolantusdataExp/events{/privacy}","received_events_url":"https://api.github.com/users/RolantusdataExp/received_events","type":"User","site_admin":false},"labels":[{"id":859202749,"node_id":"MDU6TGFiZWw4NTkyMDI3NDk=","url":"https://api.github.com/repos/aertslab/pySCENIC/labels/bug","name":"bug","color":"d73a4a","default":true,"
description":"Something isn't working"}],"state":"open","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":3,"created_at":"2024-03-14T14:16:38Z","updated_at":"2024-03-18T15:17:00Z","closed_at":null,"author_association":"NONE","active_lock_reason":null,"body":"HI, \r\nI think this is more an issue of defining my environment path, than it is a pyScenic problem, however when I try to run: \r\n\r\n!pyscenic grn {loom_path} {tf_names} -o outpath_adj --num_workers 20 \r\n\r\nI get this error: /usr/bin/sh: 1: pyscenic: not found\r\n\r\nI have pyscenic installed and imported to my JupyterNotebook. I guess the error says that it can't find pyScenic in the path described, however I don't know how to fix this\r\n\r\nBest, \r\nPeter \r\n\r\n```\r\n","reactions":{"url":"https://api.github.com/repos/aertslab/pySCENIC/issues/532/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/532/timeline","performed_via_github_app":null,"state_reason":null},"comment":{"url":"https://api.github.com/repos/aertslab/pySCENIC/issues/comments/2004192327","html_url":"https://github.com/aertslab/pySCENIC/issues/532#issuecomment-2004192327","issue_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/532","id":2004192327,"node_id":"IC_kwDOB2fPLc53dYxH","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https
://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_at":"2024-03-18T15:16:59Z","updated_at":"2024-03-18T15:16:59Z","author_association":"MEMBER","body":"What is the output of `! whereis pyscenic`","reactions":{"url":"https://api.github.com/repos/aertslab/pySCENIC/issues/comments/2004192327/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-18T15:17:00Z","org":{"id":3940817,"login":"aertslab","gravatar_id":"","url":"https://api.github.com/orgs/aertslab","avatar_url":"https://avatars.githubusercontent.com/u/3940817?"}},{"id":"36649891812","type":"PushEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":773788768,"name":"ghuls/oxbow","url":"https://api.github.com/repos/ghuls/oxbow"},"payload":{"repository_id":773788768,"push_id":17593923515,"size":1,"distinct_size":1,"ref":"refs/heads/update_dependencies","head":"5d57f1ff8dd38daabec99ed1da8e778c69fdc0aa","before":"4b1a03c947581f37717de0574353c6bc55c20117","commits":[{"sha":"5d57f1ff8dd38daabec99ed1da8e778c69fdc0aa","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Update arrow dependency to 51.\n\nUpdate arrow dependency to 
51.","distinct":true,"url":"https://api.github.com/repos/ghuls/oxbow/commits/5d57f1ff8dd38daabec99ed1da8e778c69fdc0aa"}]},"public":true,"created_at":"2024-03-18T15:15:45Z"},{"id":"36642497496","type":"CreateEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":773788768,"name":"ghuls/oxbow","url":"https://api.github.com/repos/ghuls/oxbow"},"payload":{"ref":"update_dependencies","ref_type":"branch","master_branch":"main","description":"Read specialized NGS formats as data frames in R, Python, and more.","pusher_type":"user"},"public":true,"created_at":"2024-03-18T12:01:43Z"},{"id":"36642483138","type":"ForkEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":641466193,"name":"abdenlab/oxbow","url":"https://api.github.com/repos/abdenlab/oxbow"},"payload":{"forkee":{"id":773788768,"node_id":"R_kgDOLh8UYA","name":"oxbow","full_name":"ghuls/oxbow","private":false,"owner":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/g
huls/received_events","type":"User","site_admin":false},"html_url":"https://github.com/ghuls/oxbow","description":"Read specialized NGS formats as data frames in R, Python, and more.","fork":true,"url":"https://api.github.com/repos/ghuls/oxbow","forks_url":"https://api.github.com/repos/ghuls/oxbow/forks","keys_url":"https://api.github.com/repos/ghuls/oxbow/keys{/key_id}","collaborators_url":"https://api.github.com/repos/ghuls/oxbow/collaborators{/collaborator}","teams_url":"https://api.github.com/repos/ghuls/oxbow/teams","hooks_url":"https://api.github.com/repos/ghuls/oxbow/hooks","issue_events_url":"https://api.github.com/repos/ghuls/oxbow/issues/events{/number}","events_url":"https://api.github.com/repos/ghuls/oxbow/events","assignees_url":"https://api.github.com/repos/ghuls/oxbow/assignees{/user}","branches_url":"https://api.github.com/repos/ghuls/oxbow/branches{/branch}","tags_url":"https://api.github.com/repos/ghuls/oxbow/tags","blobs_url":"https://api.github.com/repos/ghuls/oxbow/git/blobs{/sha}","git_tags_url":"https://api.github.com/repos/ghuls/oxbow/git/tags{/sha}","git_refs_url":"https://api.github.com/repos/ghuls/oxbow/git/refs{/sha}","trees_url":"https://api.github.com/repos/ghuls/oxbow/git/trees{/sha}","statuses_url":"https://api.github.com/repos/ghuls/oxbow/statuses/{sha}","languages_url":"https://api.github.com/repos/ghuls/oxbow/languages","stargazers_url":"https://api.github.com/repos/ghuls/oxbow/stargazers","contributors_url":"https://api.github.com/repos/ghuls/oxbow/contributors","subscribers_url":"https://api.github.com/repos/ghuls/oxbow/subscribers","subscription_url":"https://api.github.com/repos/ghuls/oxbow/subscription","commits_url":"https://api.github.com/repos/ghuls/oxbow/commits{/sha}","git_commits_url":"https://api.github.com/repos/ghuls/oxbow/git/commits{/sha}","comments_url":"https://api.github.com/repos/ghuls/oxbow/comments{/number}","issue_comment_url":"https://api.github.com/repos/ghuls/oxbow/issues/comments{/number}","contents_url":
"https://api.github.com/repos/ghuls/oxbow/contents/{+path}","compare_url":"https://api.github.com/repos/ghuls/oxbow/compare/{base}...{head}","merges_url":"https://api.github.com/repos/ghuls/oxbow/merges","archive_url":"https://api.github.com/repos/ghuls/oxbow/{archive_format}{/ref}","downloads_url":"https://api.github.com/repos/ghuls/oxbow/downloads","issues_url":"https://api.github.com/repos/ghuls/oxbow/issues{/number}","pulls_url":"https://api.github.com/repos/ghuls/oxbow/pulls{/number}","milestones_url":"https://api.github.com/repos/ghuls/oxbow/milestones{/number}","notifications_url":"https://api.github.com/repos/ghuls/oxbow/notifications{?since,all,participating}","labels_url":"https://api.github.com/repos/ghuls/oxbow/labels{/name}","releases_url":"https://api.github.com/repos/ghuls/oxbow/releases{/id}","deployments_url":"https://api.github.com/repos/ghuls/oxbow/deployments","created_at":"2024-03-18T12:01:17Z","updated_at":"2024-03-18T12:01:17Z","pushed_at":"2024-01-26T20:33:39Z","git_url":"git://github.com/ghuls/oxbow.git","ssh_url":"git@github.com:ghuls/oxbow.git","clone_url":"https://github.com/ghuls/oxbow.git","svn_url":"https://github.com/ghuls/oxbow","homepage":"https://lifeinbytes.substack.com/p/breaking-out-of-bioinformatic-data-silos","size":8637,"stargazers_count":0,"watchers_count":0,"language":null,"has_issues":false,"has_projects":true,"has_downloads":true,"has_wiki":true,"has_pages":false,"has_discussions":false,"forks_count":0,"mirror_url":null,"archived":false,"disabled":false,"open_issues_count":0,"license":null,"allow_forking":true,"is_template":false,"web_commit_signoff_required":false,"topics":[],"visibility":"public","forks":0,"open_issues":0,"watchers":0,"default_branch":"main","public":true}},"public":true,"created_at":"2024-03-18T12:01:17Z","org":{"id":107208028,"login":"abdenlab","gravatar_id":"","url":"https://api.github.com/orgs/abdenlab","avatar_url":"https://avatars.githubusercontent.com/u/107208028?"}},{"id":"36639156181","type":"I
ssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":17778869,"name":"alexdobin/STAR","url":"https://api.github.com/repos/alexdobin/STAR"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/alexdobin/STAR/issues/2088","repository_url":"https://api.github.com/repos/alexdobin/STAR","labels_url":"https://api.github.com/repos/alexdobin/STAR/issues/2088/labels{/name}","comments_url":"https://api.github.com/repos/alexdobin/STAR/issues/2088/comments","events_url":"https://api.github.com/repos/alexdobin/STAR/issues/2088/events","html_url":"https://github.com/alexdobin/STAR/issues/2088","id":2179107291,"node_id":"I_kwDOAQ9Itc6B4onb","number":2088,"title":"CIGAR X and = instead of M","user":{"login":"ggPeti","id":3217744,"node_id":"MDQ6VXNlcjMyMTc3NDQ=","avatar_url":"https://avatars.githubusercontent.com/u/3217744?v=4","gravatar_id":"","url":"https://api.github.com/users/ggPeti","html_url":"https://github.com/ggPeti","followers_url":"https://api.github.com/users/ggPeti/followers","following_url":"https://api.github.com/users/ggPeti/following{/other_user}","gists_url":"https://api.github.com/users/ggPeti/gists{/gist_id}","starred_url":"https://api.github.com/users/ggPeti/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ggPeti/subscriptions","organizations_url":"https://api.github.com/users/ggPeti/orgs","repos_url":"https://api.github.com/users/ggPeti/repos","events_url":"https://api.github.com/users/ggPeti/events{/privacy}","received_events_url":"https://api.github.com/users/ggPeti/received_events","type":"User","site_admin":false},"labels":[],"state":"open","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":1,"created_at":"2024-03-11T13:14:07Z","updated_at":"2024-03-18T10:18:41Z","closed_at":null,"author_association":"NONE","active_lock_reason
":null,"body":"Hello,\r\n\r\nI can't seem to find a way to make STAR output X and = in CIGAR string, only M. Yet, some comments online seem to hint at the possibility of this.\r\n\r\nAm I missing something? Please inform me.\r\n\r\nThank you!","reactions":{"url":"https://api.github.com/repos/alexdobin/STAR/issues/2088/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/alexdobin/STAR/issues/2088/timeline","performed_via_github_app":null,"state_reason":null},"comment":{"url":"https://api.github.com/repos/alexdobin/STAR/issues/comments/2003475407","html_url":"https://github.com/alexdobin/STAR/issues/2088#issuecomment-2003475407","issue_url":"https://api.github.com/repos/alexdobin/STAR/issues/2088","id":2003475407,"node_id":"IC_kwDOAQ9Itc53apvP","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_at":"2024-03-18T10:18:40Z","updated_at":"2024-03-18T10:18:40Z","author_association":"NONE","body":"You can always use:\r\n```\r\nsamtools calmd -e -o aln.with_X.bam aln.bam 
ref.fasta\r\n```","reactions":{"url":"https://api.github.com/repos/alexdobin/STAR/issues/comments/2003475407/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-18T10:18:41Z"},{"id":"36582359738","type":"IssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":329905726,"name":"aertslab/pycisTopic","url":"https://api.github.com/repos/aertslab/pycisTopic"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/aertslab/pycisTopic/issues/14","repository_url":"https://api.github.com/repos/aertslab/pycisTopic","labels_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/14/labels{/name}","comments_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/14/comments","events_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/14/events","html_url":"https://github.com/aertslab/pycisTopic/issues/14","id":918481285,"node_id":"MDU6SXNzdWU5MTg0ODEyODU=","number":14,"title":"Groupby steps when handling fragments 
[PERFORMANCE]","user":{"login":"cbravo93","id":19205518,"node_id":"MDQ6VXNlcjE5MjA1NTE4","avatar_url":"https://avatars.githubusercontent.com/u/19205518?v=4","gravatar_id":"","url":"https://api.github.com/users/cbravo93","html_url":"https://github.com/cbravo93","followers_url":"https://api.github.com/users/cbravo93/followers","following_url":"https://api.github.com/users/cbravo93/following{/other_user}","gists_url":"https://api.github.com/users/cbravo93/gists{/gist_id}","starred_url":"https://api.github.com/users/cbravo93/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/cbravo93/subscriptions","organizations_url":"https://api.github.com/users/cbravo93/orgs","repos_url":"https://api.github.com/users/cbravo93/repos","events_url":"https://api.github.com/users/cbravo93/events{/privacy}","received_events_url":"https://api.github.com/users/cbravo93/received_events","type":"User","site_admin":false},"labels":[{"id":2660203054,"node_id":"MDU6TGFiZWwyNjYwMjAzMDU0","url":"https://api.github.com/repos/aertslab/pycisTopic/labels/enhancement","name":"enhancement","color":"a2eeef","default":true,"description":"New feature or request"}],"state":"open","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":3,"created_at":"2021-06-11T09:13:12Z","updated_at":"2024-03-15T10:56:05Z","closed_at":null,"author_association":"MEMBER","active_lock_reason":null,"body":"**What type of problem are you experiencing and which function is you problem related too**\r\nWhen working with fragments files there are several steps where we use groupby operations to get matrices with barcodes and regions (or cut sites positions), which are demanding excessive memory. These occurs in qc functions (TSS, FRIP) and in the creation of cisTopicObjects. 
Also, but less problematic, in barcode_rank_plot, duplicate_rate and insert_size_distributions (here it is only use to aggregate scores by barcode and sum, not to make a matrix).\r\n\r\n**Is this problem data set related? If so, provide information on the problematic data set**\r\nThe performance of these steps is limited with medium-to-large data sets, and memory usage scales rather exponentially.\r\n\r\n**Have you identified the step of set of steps that lead to this problem in the function?**\r\nHere I will report the steps where groupby is used to create count matrices, but solutions may be applicable to barcode_rank_plot, duplicate_rate and insert_size_distributions too.\r\n- frip: qc.py: 640-641\r\n- get_tss_matrix: utils.py: 327-328\r\n- create_cistopic_object_from_fragments: cistopic_class.py: 774-791\r\n\r\n**Describe alternatives you've considered**\r\nSome solutions we have discussed:\r\n- Modin\r\n- Polars\r\n- Vaex\r\n\r\n**Additional context**\r\nAdd any other context or screenshots about the feature request here.\r\n\r\n**Version information**\r\nReport versions of modules relevant to this error\r\n\r\npycisTopic: 
0.1.dev119+g4af8f57.d20210308\r\n","reactions":{"url":"https://api.github.com/repos/aertslab/pycisTopic/issues/14/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/14/timeline","performed_via_github_app":null,"state_reason":null},"comment":{"url":"https://api.github.com/repos/aertslab/pycisTopic/issues/comments/1999409154","html_url":"https://github.com/aertslab/pycisTopic/issues/14#issuecomment-1999409154","issue_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/14","id":1999409154,"node_id":"IC_kwDOE6n2Ps53LJAC","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_at":"2024-03-15T10:56:04Z","updated_at":"2024-03-15T10:56:04Z","author_association":"MEMBER","body":"@klgoss The polars branch is functional now for 
end-users:\r\n\r\nhttps://pycistopic.readthedocs.io/en/polars/notebooks/human_cerebellum.html#QC","reactions":{"url":"https://api.github.com/repos/aertslab/pycisTopic/issues/comments/1999409154/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-15T10:56:05Z","org":{"id":3940817,"login":"aertslab","gravatar_id":"","url":"https://api.github.com/orgs/aertslab","avatar_url":"https://avatars.githubusercontent.com/u/3940817?"}},{"id":"36582091705","type":"IssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":124243757,"name":"aertslab/pySCENIC","url":"https://api.github.com/repos/aertslab/pySCENIC"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/aertslab/pySCENIC/issues/531","repository_url":"https://api.github.com/repos/aertslab/pySCENIC","labels_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/531/labels{/name}","comments_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/531/comments","events_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/531/events","html_url":"https://github.com/aertslab/pySCENIC/issues/531","id":2172272029,"node_id":"I_kwDOB2fPLc6Bej2d","number":531,"title":"[BUG] pySCENIC installation 
error","user":{"login":"strawberry098","id":134091570,"node_id":"U_kgDOB_4TMg","avatar_url":"https://avatars.githubusercontent.com/u/134091570?v=4","gravatar_id":"","url":"https://api.github.com/users/strawberry098","html_url":"https://github.com/strawberry098","followers_url":"https://api.github.com/users/strawberry098/followers","following_url":"https://api.github.com/users/strawberry098/following{/other_user}","gists_url":"https://api.github.com/users/strawberry098/gists{/gist_id}","starred_url":"https://api.github.com/users/strawberry098/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/strawberry098/subscriptions","organizations_url":"https://api.github.com/users/strawberry098/orgs","repos_url":"https://api.github.com/users/strawberry098/repos","events_url":"https://api.github.com/users/strawberry098/events{/privacy}","received_events_url":"https://api.github.com/users/strawberry098/received_events","type":"User","site_admin":false},"labels":[{"id":859202749,"node_id":"MDU6TGFiZWw4NTkyMDI3NDk=","url":"https://api.github.com/repos/aertslab/pySCENIC/labels/bug","name":"bug","color":"d73a4a","default":true,"description":"Something isn't working"}],"state":"open","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":1,"created_at":"2024-03-06T19:33:09Z","updated_at":"2024-03-15T10:47:15Z","closed_at":null,"author_association":"NONE","active_lock_reason":null,"body":"I git clone with development version of pySCENIC. 
Upon `pip3 install .`, I get the following error:\r\n\r\nCollecting ctxcore>=0.2.0 (from pyscenic==0.12.1+7.g1cd059f)\r\n Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:877)'),)': /simple/ctxcore/\r\n Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:877)'),)': /simple/ctxcore/\r\n Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:877)'),)': /simple/ctxcore/\r\n Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:877)'),)': /simple/ctxcore/\r\n Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:877)'),)': /simple/ctxcore/\r\n Could not fetch URL https://pypi.python.org/simple/ctxcore/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/ctxcore/ (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:877)'),)) - skipping\r\n Could not find a version that satisfies the requirement ctxcore>=0.2.0 (from pyscenic==0.12.1+7.g1cd059f) (from versions: )\r\nNo matching distribution found for ctxcore>=0.2.0 (from 
pyscenic==0.12.1+7.g1cd059f)\r\n\r\n","reactions":{"url":"https://api.github.com/repos/aertslab/pySCENIC/issues/531/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/531/timeline","performed_via_github_app":null,"state_reason":null},"comment":{"url":"https://api.github.com/repos/aertslab/pySCENIC/issues/comments/1999394817","html_url":"https://github.com/aertslab/pySCENIC/issues/531#issuecomment-1999394817","issue_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/531","id":1999394817,"node_id":"IC_kwDOB2fPLc53LFgB","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_at":"2024-03-15T10:47:13Z","updated_at":"2024-03-15T10:47:13Z","author_association":"MEMBER","body":"Could you try again? 
This error is not related to pySCENIC itself, but with pypi.org or your local certificate store.\r\nIt could be that there was a temporarily certificate problem on pypi's part, or you need to update your local certificate store to have the latest certificates.\r\n\r\nOn Ubuntu/Debian based distributions: https://www.cyberciti.biz/faq/update-ca-certificates-command-examples-in-linux-to-ssl-ca-certificates/","reactions":{"url":"https://api.github.com/repos/aertslab/pySCENIC/issues/comments/1999394817/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-15T10:47:15Z","org":{"id":3940817,"login":"aertslab","gravatar_id":"","url":"https://api.github.com/orgs/aertslab","avatar_url":"https://avatars.githubusercontent.com/u/3940817?"}},{"id":"36581872486","type":"IssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":124243757,"name":"aertslab/pySCENIC","url":"https://api.github.com/repos/aertslab/pySCENIC"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/aertslab/pySCENIC/issues/532","repository_url":"https://api.github.com/repos/aertslab/pySCENIC","labels_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/532/labels{/name}","comments_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/532/comments","events_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/532/events","html_url":"https://github.com/aertslab/pySCENIC/issues/532","id":2186498657,"node_id":"I_kwDOB2fPLc6CU1Jh","number":532,"title":"/usr/bin/sh: 1: pyscenic: not found 
[BUG]","user":{"login":"RolantusdataExp","id":75480608,"node_id":"MDQ6VXNlcjc1NDgwNjA4","avatar_url":"https://avatars.githubusercontent.com/u/75480608?v=4","gravatar_id":"","url":"https://api.github.com/users/RolantusdataExp","html_url":"https://github.com/RolantusdataExp","followers_url":"https://api.github.com/users/RolantusdataExp/followers","following_url":"https://api.github.com/users/RolantusdataExp/following{/other_user}","gists_url":"https://api.github.com/users/RolantusdataExp/gists{/gist_id}","starred_url":"https://api.github.com/users/RolantusdataExp/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/RolantusdataExp/subscriptions","organizations_url":"https://api.github.com/users/RolantusdataExp/orgs","repos_url":"https://api.github.com/users/RolantusdataExp/repos","events_url":"https://api.github.com/users/RolantusdataExp/events{/privacy}","received_events_url":"https://api.github.com/users/RolantusdataExp/received_events","type":"User","site_admin":false},"labels":[{"id":859202749,"node_id":"MDU6TGFiZWw4NTkyMDI3NDk=","url":"https://api.github.com/repos/aertslab/pySCENIC/labels/bug","name":"bug","color":"d73a4a","default":true,"description":"Something isn't working"}],"state":"open","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":1,"created_at":"2024-03-14T14:16:38Z","updated_at":"2024-03-15T10:39:44Z","closed_at":null,"author_association":"NONE","active_lock_reason":null,"body":"HI, \r\nI think this is more an issue of defining my environment path, than it is a pyScenic problem, however when I try to run: \r\n\r\n!pyscenic grn {loom_path} {tf_names} -o outpath_adj --num_workers 20 \r\n\r\nI get this error: /usr/bin/sh: 1: pyscenic: not found\r\n\r\nI have pyscenic installed and imported to my JupyterNotebook. 
I guess the error says that it can't find pyScenic in the path described, however I don't know how to fix this\r\n\r\nBest, \r\nPeter \r\n\r\n```\r\n","reactions":{"url":"https://api.github.com/repos/aertslab/pySCENIC/issues/532/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/532/timeline","performed_via_github_app":null,"state_reason":null},"comment":{"url":"https://api.github.com/repos/aertslab/pySCENIC/issues/comments/1999383010","html_url":"https://github.com/aertslab/pySCENIC/issues/532#issuecomment-1999383010","issue_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/532","id":1999383010,"node_id":"IC_kwDOB2fPLc53LCni","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_at":"2024-03-15T10:39:42Z","updated_at":"2024-03-15T10:39:42Z","author_association":"MEMBER","body":"How did you try to install it?\r\n\r\nDoes importing `pyscenic` work in python itself, or does that also not work?\r\n```python\r\nimport 
pyscenic\r\n```","reactions":{"url":"https://api.github.com/repos/aertslab/pySCENIC/issues/comments/1999383010/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-15T10:39:44Z","org":{"id":3940817,"login":"aertslab","gravatar_id":"","url":"https://api.github.com/orgs/aertslab","avatar_url":"https://avatars.githubusercontent.com/u/3940817?"}},{"id":"36581736616","type":"IssuesEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":124243757,"name":"aertslab/pySCENIC","url":"https://api.github.com/repos/aertslab/pySCENIC"},"payload":{"action":"closed","issue":{"url":"https://api.github.com/repos/aertslab/pySCENIC/issues/523","repository_url":"https://api.github.com/repos/aertslab/pySCENIC","labels_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/523/labels{/name}","comments_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/523/comments","events_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/523/events","html_url":"https://github.com/aertslab/pySCENIC/issues/523","id":2075878610,"node_id":"I_kwDOB2fPLc57u2TS","number":523,"title":"Scenic is not compatible with 
snakemake","user":{"login":"wangjiawen2013","id":29703450,"node_id":"MDQ6VXNlcjI5NzAzNDUw","avatar_url":"https://avatars.githubusercontent.com/u/29703450?v=4","gravatar_id":"","url":"https://api.github.com/users/wangjiawen2013","html_url":"https://github.com/wangjiawen2013","followers_url":"https://api.github.com/users/wangjiawen2013/followers","following_url":"https://api.github.com/users/wangjiawen2013/following{/other_user}","gists_url":"https://api.github.com/users/wangjiawen2013/gists{/gist_id}","starred_url":"https://api.github.com/users/wangjiawen2013/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/wangjiawen2013/subscriptions","organizations_url":"https://api.github.com/users/wangjiawen2013/orgs","repos_url":"https://api.github.com/users/wangjiawen2013/repos","events_url":"https://api.github.com/users/wangjiawen2013/events{/privacy}","received_events_url":"https://api.github.com/users/wangjiawen2013/received_events","type":"User","site_admin":false},"labels":[{"id":859202749,"node_id":"MDU6TGFiZWw4NTkyMDI3NDk=","url":"https://api.github.com/repos/aertslab/pySCENIC/labels/bug","name":"bug","color":"d73a4a","default":true,"description":"Something isn't working"}],"state":"closed","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":6,"created_at":"2024-01-11T06:40:14Z","updated_at":"2024-03-15T10:35:08Z","closed_at":"2024-03-15T10:35:08Z","author_association":"NONE","active_lock_reason":null,"body":"Hi,\r\n\r\nSnakemake is also a workflow management system. Snakemake is highly popular, with [>10 new citations per week](https://badge.dimensions.ai/details/id/pub.1018944052). For an introduction, please visit https://snakemake.github.io/.\r\n\r\nThough there are two Nextflow implementations available for scenic, I found that pyscenic can cause an error \"NameError: name 'snakemake' is not defined\" when running with Snakemake. Is it related to numba.jit decorator ? 
Could you provide Snakemake implementations for pyscenic ? \r\n\r\n![image](https://github.com/aertslab/pySCENIC/assets/29703450/b4e19fe1-a55c-4b32-879a-ff24545cff1b)\r\n","reactions":{"url":"https://api.github.com/repos/aertslab/pySCENIC/issues/523/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/aertslab/pySCENIC/issues/523/timeline","performed_via_github_app":null,"state_reason":"completed"}},"public":true,"created_at":"2024-03-15T10:35:09Z","org":{"id":3940817,"login":"aertslab","gravatar_id":"","url":"https://api.github.com/orgs/aertslab","avatar_url":"https://avatars.githubusercontent.com/u/3940817?"}},{"id":"36581681712","type":"PushEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":329905726,"name":"aertslab/pycisTopic","url":"https://api.github.com/repos/aertslab/pycisTopic"},"payload":{"repository_id":329905726,"push_id":17558275855,"size":15,"distinct_size":15,"ref":"refs/heads/polars","head":"67eabaa75c9c23999b210e672c1e809b88ca80e7","before":"e81cb25f0d75d2b95c4353eefc1c83e5ab429df2","commits":[{"sha":"53fd583fd7cbad3fc0b641288abe8405d89e4e43","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Change default number of iterations for topic modeling.\n\nChange default number of iterations for topic modeling.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/53fd583fd7cbad3fc0b641288abe8405d89e4e43"},{"sha":"9fe7a102007fdc314a1320da49ceee680f857520","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Update description of TSS subparser.\n\nUpdate description of TSS 
subparser.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/9fe7a102007fdc314a1320da49ceee680f857520"},{"sha":"100cf33da7a5d6c490d588990f0de67324b51248","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Update Ruff config settings to be compatible with more recent versions.\n\nUpdate Ruff config settings to be compatible with more recent versions.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/100cf33da7a5d6c490d588990f0de67324b51248"},{"sha":"fd5350bb920989b6febf9e24f64c6758cf461979","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Reformat lda_models.py with ruff and rearange imports.\n\nReformat lda_models.py with ruff and rearange imports and fix\n\"cisTopicObject\" to \"CistopicObject\".","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/fd5350bb920989b6febf9e24f64c6758cf461979"},{"sha":"767418b66cba6bb38bf0496df9dfa962595e940f","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Fix \"cisTopicObject\" to \"CistopicObject\" in topic_binarization.py.\n\nFix \"cisTopicObject\" to \"CistopicObject\" in topic_binarization.py.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/767418b66cba6bb38bf0496df9dfa962595e940f"},{"sha":"996a02bafa4f14cb8a931bf1194e32b7b7e26af5","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Fix typo in \"accessible\".\n\nFix typo in \"accessible\".","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/996a02bafa4f14cb8a931bf1194e32b7b7e26af5"},{"sha":"bd2c8a26b2674ecf439accb648c49b3b137abd35","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Remove \"--remove-stopwords\" argument from Mallet \"import-file\" command.\n\nRemove \"--remove-stopwords\" argument from Mallet \"import-file\" command\nas we always only have 
just numbers as text. In practice having this\nparameter or not, should not make any difference.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/bd2c8a26b2674ecf439accb648c49b3b137abd35"},{"sha":"f24359a9ac6ba7d16ea33d2361b73ce7745ff795","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Get num_models more efficiently and create id2word with FakeDict.\n\nGet num_models more efficiently and create id2word with FakeDict.\n\n`id2word` is only needed to get `num_terms` easily.\nAll other usages of `id2word` are not relevant anymore as\nthe int version will map to the string version of the same\nnumber in our case.\n\nThis allows to remove some slow code like:\n corpora.Dictionary.from_corpus(corpus, names_dict)","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/f24359a9ac6ba7d16ea33d2361b73ce7745ff795"},{"sha":"f8b8f663363ba7eace00f6f1d48836820fe32d48","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Improve creation speed of corpus to Mallet file.\n\nImprove creation speed of corpus to Mallet file:\n - by not using lookups in id2word as they would be int to string\n representation of same number lookups.\n - by removing the repetition of the token id by the amount of counts\n for that token (which would always be one as we start from a\n binary matrix as input to generate the corpus.\n\nRemoved never used (in practice) arguments in convert_input and\nsimplified the code a bit.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/f8b8f663363ba7eace00f6f1d48836820fe32d48"},{"sha":"a928ff201805e095ab8576692906dca08d61fffc","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Remove some unused variables and imports.\n\nRemove some unused variables and 
imports.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/a928ff201805e095ab8576692906dca08d61fffc"},{"sha":"c6c44825ddd3cfd0ee86bb7a105acb65f455ba3e","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Construct Mallet command lines as a list of arguments instead of relying on the shell.\n\nConstruct Mallet command lines as a list of arguments instead of\nrelying on the shell.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/c6c44825ddd3cfd0ee86bb7a105acb65f455ba3e"},{"sha":"17e5b6e44f4cb8e3847456cca955a46d9ddfd1d1","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Reformat cistopic_class.py with ruff and rearange imports.\n\nReformat cistopic_class.py with ruff and rearange imports.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/17e5b6e44f4cb8e3847456cca955a46d9ddfd1d1"},{"sha":"647d30d2d6c6e465ed6ee90e06f1e0795f704f2d","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Display \"True\" and \"False\" option for \"--keep\" option in topic modeling CLI help.\n\nDisplay \"True\" and \"False\" option for \"--keep\" option in topic modeling CLI help.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/647d30d2d6c6e465ed6ee90e06f1e0795f704f2d"},{"sha":"750fd63e1c6729dfc4659326e8b278e6b0d9b354","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Add option to specify Mallet binary path in topic modeling CLI subcommand.\n\nAdd option to specify Mallet binary path in topic modeling CLI subcommand\nand make sure the directory to which intermediate models can be saved\nexists.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/750fd63e1c6729dfc4659326e8b278e6b0d9b354"},{"sha":"67eabaa75c9c23999b210e672c1e809b88ca80e7","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert 
Hulselmans"},"message":"Speedup load_word_topics by reading Mallet state file with Polars.\n\nSpeedup load_word_topics by reading Mallet state file with Polars\nand doing the topic-region occurence calculation directly with\nPolars and by filling in the word_topics matrix in one go.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/67eabaa75c9c23999b210e672c1e809b88ca80e7"}]},"public":true,"created_at":"2024-03-15T10:33:17Z","org":{"id":3940817,"login":"aertslab","gravatar_id":"","url":"https://api.github.com/orgs/aertslab","avatar_url":"https://avatars.githubusercontent.com/u/3940817?"}},{"id":"36483751445","type":"PushEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":329905726,"name":"aertslab/pycisTopic","url":"https://api.github.com/repos/aertslab/pycisTopic"},"payload":{"repository_id":329905726,"push_id":17510053409,"size":5,"distinct_size":5,"ref":"refs/heads/polars","head":"e81cb25f0d75d2b95c4353eefc1c83e5ab429df2","before":"b4fcbdf4e9b2eb3e6733f0ec89bb5a5f0d6ac3d5","commits":[{"sha":"092527f345ab60bcd949c0179162e97e6c67a856","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Move TSS related subcommand code to separate file.\n\nMove TSS related subcommand code to separate file.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/092527f345ab60bcd949c0179162e97e6c67a856"},{"sha":"423eb46b3bff849424edd397c0be19d14d92ad63","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Move QC related subcommand code to separate file.\n\nMove QC related subcommand code to separate 
file.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/423eb46b3bff849424edd397c0be19d14d92ad63"},{"sha":"6dabefba97b7d5df7d24d9cba6b303189e8e5fb9","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Reformat topic modeling scripts with \"ruff\" and fix slight inconsitencies between the 2 scripts.\n\nReformat topic modeling scripts with \"ruff\" and fix slight\ninconsitencies between the 2 scripts.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/6dabefba97b7d5df7d24d9cba6b303189e8e5fb9"},{"sha":"49e40a1194eeef7aec8e2a83e4afd3f990076f81","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Rewrite argument handling in topic modeling scripts.\n\nRewrite argument handling in topic modeling scripts:\n - Make all short options 1 character long.\n - Use better argument names.\n - Update argument description.\n - Automaticly parse boolean string values to booleans\n during argparse processing.\n - Rearrange main function and print output.\n # Please enter the commit message for your changes. 
Lines starting","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/49e40a1194eeef7aec8e2a83e4afd3f990076f81"},{"sha":"e81cb25f0d75d2b95c4353eefc1c83e5ab429df2","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Add topic modeling as subcommand to pycistopic CLI.\n\nAdd topic modeling as subcommand to pycistopic CLI and provide both\nlda and mallet as topic model implementation.\n\nRemove the standalone versions of the scripts.","distinct":true,"url":"https://api.github.com/repos/aertslab/pycisTopic/commits/e81cb25f0d75d2b95c4353eefc1c83e5ab429df2"}]},"public":true,"created_at":"2024-03-12T17:45:20Z","org":{"id":3940817,"login":"aertslab","gravatar_id":"","url":"https://api.github.com/orgs/aertslab","avatar_url":"https://avatars.githubusercontent.com/u/3940817?"}},{"id":"36473989325","type":"IssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":23808489,"name":"lda-project/lda","url":"https://api.github.com/repos/lda-project/lda"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/lda-project/lda/issues/128","repository_url":"https://api.github.com/repos/lda-project/lda","labels_url":"https://api.github.com/repos/lda-project/lda/issues/128/labels{/name}","comments_url":"https://api.github.com/repos/lda-project/lda/issues/128/comments","events_url":"https://api.github.com/repos/lda-project/lda/issues/128/events","html_url":"https://github.com/lda-project/lda/pull/128","id":2148984683,"node_id":"PR_kwDOAWtJ6c5npIx5","number":128,"title":"Construct nzw in fortran order to reduce cache misses in hot 
loops.","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"labels":[],"state":"closed","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":3,"created_at":"2024-02-22T12:38:50Z","updated_at":"2024-03-12T13:22:10Z","closed_at":"2024-02-23T22:03:45Z","author_association":"CONTRIBUTOR","active_lock_reason":null,"draft":false,"pull_request":{"url":"https://api.github.com/repos/lda-project/lda/pulls/128","html_url":"https://github.com/lda-project/lda/pull/128","diff_url":"https://github.com/lda-project/lda/pull/128.diff","patch_url":"https://github.com/lda-project/lda/pull/128.patch","merged_at":"2024-02-23T22:03:45Z"},"body":"Construct nzw in fortran order to reduce cache misses in hot loops. 
as it is always iterater over in the topic direction.\r\n\r\nThis improves the speed of:\r\n - _sample_topics\r\n - _loglikelihood\r\n - ldac2dtm\r\n\r\nBenchmark script:\r\n\r\n $ cat bench_lda_fast.py #!/usr/bin/python3\r\n\r\n import os import sys import time\r\n\r\n import lda import lda.utils\r\n\r\n ldac_fn = sys.argv[1] t0 = time.time() dtm = lda.utils.ldac2dtm(open(ldac_fn), offset=0) print(\"ldac2dtm took: {} seconds\".format(time.time() - t0)) t1 = time.time() n_iter = 100 n_topics = 100 random_seed = 1 model = lda.LDA(n_topics=n_topics, n_iter=n_iter, random_state=random_seed) print(\"Model construction took: {} seconds\".format(time.time() - t1)) t2 = time.time() doc_topic = model.fit_transform(dtm) print(\"Running model took: {} seconds\".format(time.time() - t2)) print(\"Total: {} seconds\".format(time.time() - t0))\r\n\r\nOriginal code on reuters.ldac:\r\n\r\n $ python bench_lda_fast.py ../lda/tests/reuters.ldac ldac2dtm took: 0.6914153099060059 seconds Model construction took: 0.0010924339294433594 seconds INFO:lda:n_documents: 395 INFO:lda:vocab_size: 4258 INFO:lda:n_words: 84010 INFO:lda:n_topics: 100 INFO:lda:n_iter: 100 INFO:lda:<0> log likelihood: -1258788 INFO:lda:<10> log likelihood: -734487 INFO:lda:<20> log likelihood: -708068 INFO:lda:<30> log likelihood: -698147 INFO:lda:<40> log likelihood: -691363 INFO:lda:<50> log likelihood: -687491 INFO:lda:<60> log likelihood: -684165 INFO:lda:<70> log likelihood: -682336 INFO:lda:<80> log likelihood: -680400 INFO:lda:<90> log likelihood: -678905 INFO:lda:<99> log likelihood: -676797 Running model took: 8.397335290908813 seconds Total: 9.089883804321289 seconds\r\n\r\nNew code on reuters.ldac:\r\n\r\n $ python bench_lda_fast.py ../lda/tests/reuters.ldac ldac2dtm took: 0.6984512805938721 seconds Model construction took: 0.001010894775390625 seconds INFO:lda:n_documents: 395 INFO:lda:vocab_size: 4258 INFO:lda:n_words: 84010 INFO:lda:n_topics: 100 INFO:lda:n_iter: 100 INFO:lda:<0> log likelihood: 
-1258788 INFO:lda:<10> log likelihood: -734487 INFO:lda:<20> log likelihood: -708068 INFO:lda:<30> log likelihood: -698147 INFO:lda:<40> log likelihood: -691363 INFO:lda:<50> log likelihood: -687491 INFO:lda:<60> log likelihood: -684165 INFO:lda:<70> log likelihood: -682336 INFO:lda:<80> log likelihood: -680400 INFO:lda:<90> log likelihood: -678905 INFO:lda:<99> log likelihood: -676797 Running model took: 6.758747816085815 seconds Total: 7.458248615264893 seconds\r\n\r\nOld code on a bigger dataset:\r\n\r\n $ python bench_lda_fast.py words_5000_docs_10000_sample_freq_0.05.ldac ldac2dtm took: 29.605552911758423 seconds Model construction took: 0.0009605884552001953 seconds INFO:lda:n_documents: 10000 INFO:lda:vocab_size: 5000 INFO:lda:n_words: 2501477 INFO:lda:n_topics: 100 INFO:lda:n_iter: 100 INFO:lda:<0> log likelihood: -36823892 INFO:lda:<10> log likelihood: -27142527 INFO:lda:<20> log likelihood: -26367123 INFO:lda:<30> log likelihood: -26054203 INFO:lda:<40> log likelihood: -25886173 INFO:lda:<50> log likelihood: -25782106 INFO:lda:<60> log likelihood: -25719117 INFO:lda:<70> log likelihood: -25671755 INFO:lda:<80> log likelihood: -25639538 INFO:lda:<90> log likelihood: -25611296 INFO:lda:<99> log likelihood: -25589423 Running model took: 260.66207242012024 seconds Total: 290.2686412334442 seconds\r\n\r\nNew code on a bigger dataset:\r\n\r\n $ python bench_lda_fast.py words_5000_docs_10000_sample_freq_0.05.ldac ldac2dtm took: 27.78724980354309 seconds Model construction took: 0.0008566379547119141 seconds INFO:lda:n_documents: 10000 INFO:lda:vocab_size: 5000 INFO:lda:n_words: 2501477 INFO:lda:n_topics: 100 INFO:lda:n_iter: 100 INFO:lda:<0> log likelihood: -36823892 INFO:lda:<10> log likelihood: -27142527 INFO:lda:<20> log likelihood: -26367123 INFO:lda:<30> log likelihood: -26054203 INFO:lda:<40> log likelihood: -25886173 INFO:lda:<50> log likelihood: -25782106 INFO:lda:<60> log likelihood: -25719117 INFO:lda:<70> log likelihood: -25671755 INFO:lda:<80> log 
likelihood: -25639538 INFO:lda:<90> log likelihood: -25611296 INFO:lda:<99> log likelihood: -25589423 Running model took: 199.07503652572632 seconds Total: 226.86320090293884 seconds\r\n\r\nTo generate sample datasets:\r\n\r\n generate_ldac() { local n_words=\"${1}\"; local n_docs=\"${2}\"; local sample_freq=\"${3}\";\r\n\r\n awk \\\r\n -v n_words=\"${n_words}\" \\\r\n -v n_docs=\"${n_docs}\" \\\r\n -v sample_freq=\"${sample_freq}\" \\\r\n '\r\n BEGIN {\r\n sample_freq_threshold = 1 - sample_freq;\r\n\r\n for ( doc_idx = 0; doc_idx < n_docs; doc_idx++ ) {\r\n delete words;\r\n current_n_words = 0;\r\n\r\n for ( word_idx = 0; word_idx < n_words; word_idx++ ) {\r\n if ( rand() > sample_freq_threshold ) {\r\n # Only retain word if it passes the threshold.\r\n words[word_idx] = 1;\r\n current_n_words += 1;\r\n }\r\n }\r\n\r\n printf \"%d\", current_n_words;\r\n\r\n for ( word_idx in words ) {\r\n printf \" %d:1\", word_idx;\r\n }\r\n\r\n printf \"\\n\";\r\n }\r\n }' > \"words_${n_words}_docs_${n_docs}_sample_freq_${sample_freq}.ldac\"\r\n }\r\n\r\n generate_ldac 5000 10000 
0.05","reactions":{"url":"https://api.github.com/repos/lda-project/lda/issues/128/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/lda-project/lda/issues/128/timeline","performed_via_github_app":null,"state_reason":null},"comment":{"url":"https://api.github.com/repos/lda-project/lda/issues/comments/1991641433","html_url":"https://github.com/lda-project/lda/pull/128#issuecomment-1991641433","issue_url":"https://api.github.com/repos/lda-project/lda/issues/128","id":1991641433,"node_id":"IC_kwDOAWtJ6c52tglZ","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_at":"2024-03-12T13:22:08Z","updated_at":"2024-03-12T13:22:08Z","author_association":"CONTRIBUTOR","body":"@ariddell It would be nice to have this in a 3.0.1 
release.","reactions":{"url":"https://api.github.com/repos/lda-project/lda/issues/comments/1991641433/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-12T13:22:10Z","org":{"id":29574507,"login":"lda-project","gravatar_id":"","url":"https://api.github.com/orgs/lda-project","avatar_url":"https://avatars.githubusercontent.com/u/29574507?"}},{"id":"36472106925","type":"IssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":329905726,"name":"aertslab/pycisTopic","url":"https://api.github.com/repos/aertslab/pycisTopic"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/aertslab/pycisTopic/issues/31","repository_url":"https://api.github.com/repos/aertslab/pycisTopic","labels_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/31/labels{/name}","comments_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/31/comments","events_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/31/events","html_url":"https://github.com/aertslab/pycisTopic/issues/31","id":1231085954,"node_id":"I_kwDOE6n2Ps5JYOGC","number":31,"title":"Problems running the `compute_qc_stats()` 
[PERFORMANCE]","user":{"login":"maxim-h","id":22867431,"node_id":"MDQ6VXNlcjIyODY3NDMx","avatar_url":"https://avatars.githubusercontent.com/u/22867431?v=4","gravatar_id":"","url":"https://api.github.com/users/maxim-h","html_url":"https://github.com/maxim-h","followers_url":"https://api.github.com/users/maxim-h/followers","following_url":"https://api.github.com/users/maxim-h/following{/other_user}","gists_url":"https://api.github.com/users/maxim-h/gists{/gist_id}","starred_url":"https://api.github.com/users/maxim-h/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/maxim-h/subscriptions","organizations_url":"https://api.github.com/users/maxim-h/orgs","repos_url":"https://api.github.com/users/maxim-h/repos","events_url":"https://api.github.com/users/maxim-h/events{/privacy}","received_events_url":"https://api.github.com/users/maxim-h/received_events","type":"User","site_admin":false},"labels":[],"state":"closed","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":8,"created_at":"2022-05-10T12:32:12Z","updated_at":"2024-03-12T12:30:56Z","closed_at":"2023-01-31T09:03:23Z","author_association":"COLLABORATOR","active_lock_reason":null,"body":"**Describe the bug**\r\n\r\nWhen following the `Single_sample_workflow.ipynb` I get an error while executing `compute_qc_stats()`. 
I'm using one of the samples from [NeurIPS2021 BMMC data](https://openproblems.bio/neurips_2021/).\r\n\r\n**To Reproduce**\r\n```py\r\nfrom pycisTopic.qc import *\r\npath_to_regions = {'s1d1':outDir + 'consensus_peak_calling/consensus_regions.bed'}\r\nmetadata_bc, profile_data_dict = compute_qc_stats(fragments_dict = fragments_dict,\r\n tss_annotation = annot,\r\n stats=['barcode_rank_plot', 'duplicate_rate', 'insert_size_distribution', 'profile_tss', 'frip'],\r\n label_list = None,\r\n path_to_regions = path_to_regions,\r\n n_cpu = 12,\r\n valid_bc = None,\r\n n_frag = 100,\r\n n_bc = None,\r\n tss_flank_window = 1000,\r\n tss_window = 50,\r\n tss_minimum_signal_window = 100,\r\n tss_rolling_window = 10,\r\n remove_duplicates = True,\r\n _temp_dir = \"/scratch/kholmato/\" + 'ray_spill')\r\n```\r\n\r\n**Error output**\r\n\r\n\r\n Command output
\r\n\r\n```\r\n2022-05-10 09:28:21,105 cisTopic INFO n_cpu is larger than the number of samples. Setting n_cpu to the number of samples\r\n2022-05-10 09:28:27,947\tINFO services.py:1412 -- View the Ray dashboard at [http://127.0.0.1:8265](http://127.0.0.1:8265\r\n(raylet) loop.run_until_complete(agent.run())\r\n(raylet) File \"/home/kholmato/.micromamba/envs/scenicplus/lib/python3.8/asyncio/base_events.py\", line 616, in run_until_complete\r\n(raylet) return future.result()\r\n(raylet) File \"/home/kholmato/.micromamba/envs/scenicplus/lib/python3.8/site-packages/ray/dashboard/agent.py\", line 178, in run\r\n(raylet) modules = self._load_modules()\r\n(raylet) File \"/home/kholmato/.micromamba/envs/scenicplus/lib/python3.8/site-packages/ray/dashboard/agent.py\", line 120, in _load_modules\r\n(raylet) c = cls(self)\r\n(raylet) File \"/home/kholmato/.micromamba/envs/scenicplus/lib/python3.8/site-packages/ray/dashboard/modules/reporter/reporter_agent.py\", line 161, in __init__\r\n(raylet) self._metrics_agent = MetricsAgent(\r\n(raylet) File \"/home/kholmato/.micromamba/envs/scenicplus/lib/python3.8/site-packages/ray/_private/metrics_agent.py\", line 75, in __init__\r\n(raylet) prometheus_exporter.new_stats_exporter(\r\n(raylet) File \"/home/kholmato/.micromamba/envs/scenicplus/lib/python3.8/site-packages/ray/_private/prometheus_exporter.py\", line 332, in new_stats_exporter\r\n(raylet) exporter = PrometheusStatsExporter(\r\n(raylet) File \"/home/kholmato/.micromamba/envs/scenicplus/lib/python3.8/site-packages/ray/_private/prometheus_exporter.py\", line 265, in __init__\r\n(raylet) self.serve_http()\r\n(raylet) File \"/home/kholmato/.micromamba/envs/scenicplus/lib/python3.8/site-packages/ray/_private/prometheus_exporter.py\", line 319, in serve_http\r\n(raylet) start_http_server(\r\n(raylet) File \"/home/kholmato/.micromamba/envs/scenicplus/lib/python3.8/site-packages/prometheus_client/exposition.py\", line 168, in start_wsgi_server\r\n(raylet) TmpServer.address_family, 
addr = _get_best_family(addr, port)\r\n(raylet) File \"/home/kholmato/.micromamba/envs/scenicplus/lib/python3.8/site-packages/prometheus_client/exposition.py\", line 157, in _get_best_family\r\n(raylet) infos = socket.getaddrinfo(address, port)\r\n(raylet) File \"/home/kholmato/.micromamba/envs/scenicplus/lib/python3.8/socket.py\", line 918, in getaddrinfo\r\n(raylet) for res in _socket.getaddrinfo(host, port, family, type, proto, flags):\r\n(raylet) socket.gaierror: [Errno -2] Name or service not known\r\n(raylet) \r\n(raylet) During handling of the above exception, another exception occurred:\r\n(raylet) \r\n(raylet) Traceback (most recent call last):\r\n(raylet) File \"/home/kholmato/.micromamba/envs/scenicplus/lib/python3.8/site-packages/ray/dashboard/agent.py\", line 407, in \r\n(raylet) gcs_publisher = GcsPublisher(args.gcs_address)\r\n(raylet) TypeError: __init__() takes 1 positional argument but 2 were given\r\n2022-05-10 09:31:45,265\tWARNING worker.py:1326 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: c844f38e73700f0a4092937dde2f6c2a1e9317c301000000 Worker ID: 90e5694e1b68f0f6d4866e912f15492dc66870249d2681b505f79e20 Node ID: d0e2524ba25878415b949b86b56f6569d19338e5c52ec713e016e65b Worker IP address: 10.133.125.109 Worker port: 37471 Worker PID: 365\r\n(compute_qc_stats_ray pid=1047) 2022-05-10 09:32:02,008 cisTopic INFO Reading s1d1\r\n2022-05-10 09:35:07,922\tWARNING worker.py:1326 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. 
RayTask ID: 5d185699b76f18f8685f983012c6b70ff6ad8df401000000 Worker ID: e350ccece30580bf12005d7cd2ef2c161257d8b3d0a2c7e6a2197a51 Node ID: d0e2524ba25878415b949b86b56f6569d19338e5c52ec713e016e65b Worker IP address: 10.133.125.109 Worker port: 40743 Worker PID: 1047\r\n(compute_qc_stats_ray pid=1126) 2022-05-10 09:35:13,087 cisTopic INFO Reading s1d1\r\n2022-05-10 09:38:18,093\tWARNING worker.py:1326 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: b23d4387bf3f51e56bf41be17b75897f9c1294ad01000000 Worker ID: 8aa66876fe55248e290ed116d9da9a48333811ca55c382d2429cd844 Node ID: d0e2524ba25878415b949b86b56f6569d19338e5c52ec713e016e65b Worker IP address: 10.133.125.109 Worker port: 45711 Worker PID: 1126\r\n2022-05-10 09:41:28,109\tWARNING worker.py:1326 -- A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. 
RayTask ID: 2744be47b942047896a10447f27a3e3b5525f1f201000000 Worker ID: 667d84edea8e1a0de4ab3869c7354e844f178fa695cb592ee538868e Node ID: d0e2524ba25878415b949b86b56f6569d19338e5c52ec713e016e65b Worker IP address: 10.133.125.109 Worker port: 35070 Worker PID: 1168\r\n---------------------------------------------------------------------------\r\nWorkerCrashedError Traceback (most recent call last)\r\nInput In [17], in ()\r\n 1 from pycisTopic.qc import *\r\n 2 path_to_regions = {'s1d1':outDir + 'consensus_peak_calling/consensus_regions.bed'}\r\n----> 3 metadata_bc, profile_data_dict = compute_qc_stats(fragments_dict = fragments_dict,\r\n 4 tss_annotation = annot,\r\n 5 stats=['barcode_rank_plot', 'duplicate_rate', 'insert_size_distribution', 'profile_tss', 'frip'],\r\n 6 label_list = None,\r\n 7 path_to_regions = path_to_regions,\r\n 8 n_cpu = 12,\r\n 9 valid_bc = None,\r\n 10 n_frag = 100,\r\n 11 n_bc = None,\r\n 12 tss_flank_window = 1000,\r\n 13 tss_window = 50,\r\n 14 tss_minimum_signal_window = 100,\r\n 15 tss_rolling_window = 10,\r\n 16 remove_duplicates = True,\r\n 17 _temp_dir = \"/scratch/kholmato/\" + 'ray_spill')\r\n\r\nFile ~/.builds/scenicplus/pycisTopic/pycisTopic/qc.py:859, in compute_qc_stats(fragments_dict, tss_annotation, stats, label_list, path_to_regions, n_cpu, partition, valid_bc, n_frag, n_bc, tss_flank_window, tss_window, tss_minimum_signal_window, tss_rolling_window, min_norm, check_for_duplicates, remove_duplicates, **kwargs)\r\n 856 n_cpu = len(fragments_list)\r\n 858 ray.init(num_cpus=n_cpu, **kwargs)\r\n--> 859 qc_stats = ray.get(\r\n 860 [\r\n 861 compute_qc_stats_ray.remote(\r\n 862 fragments_list[i],\r\n 863 tss_annotation=tss_annotation,\r\n 864 stats=stats,\r\n 865 label=label_list[i],\r\n 866 path_to_regions=path_to_regions[i],\r\n 867 valid_bc=valid_bc,\r\n 868 n_frag=n_frag,\r\n 869 n_bc=n_bc,\r\n 870 tss_flank_window=tss_flank_window,\r\n 871 tss_window=tss_window,\r\n 872 tss_minimum_signal_window=tss_minimum_signal_window,\r\n 
873 tss_rolling_window=tss_rolling_window,\r\n 874 min_norm=min_norm,\r\n 875 partition=partition,\r\n 876 check_for_duplicates=check_for_duplicates,\r\n 877 remove_duplicates=remove_duplicates)\r\n 878 for i in range(len(fragments_list))\r\n 879 ]\r\n 880 )\r\n 881 ray.shutdown()\r\n 882 metadata_dict = {key: x[key] for x in list(\r\n 883 list(zip(*qc_stats))[0]) for key in x.keys()}\r\n\r\n\r\nFile ~/.micromamba/envs/scenicplus/lib/python3.8/site-packages/ray/_private/client_mode_hook.py:105, in client_mode_hook..wrapper(*args, **kwargs)\r\n 103 if func.__name__ != \"init\" or is_client_mode_enabled_by_default:\r\n 104 return getattr(ray, func.__name__)(*args, **kwargs)\r\n--> 105 return func(*args, **kwargs)\r\n\r\nFile ~/.micromamba/envs/scenicplus/lib/python3.8/site-packages/ray/worker.py:1765, in get(object_refs, timeout)\r\n 1763 raise value.as_instanceof_cause()\r\n 1764 else:\r\n-> 1765 raise value\r\n 1767 if is_individual_id:\r\n 1768 values = values[0]\r\n\r\nWorkerCrashedError: The worker died unexpectedly while executing this task. Check python-core-worker-*.log files for more information.\r\n```\r\n\r\n | \r\n\r\n**Expected behavior**\r\nNot an error)\r\n\r\n**Version (please complete the following information):**\r\n - Python: '3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:18) [GCC 10.3.0]'\r\n \r\n pip freeze
\r\n\r\nadjustText==0.7.3\r\naiohttp==3.8.1\r\naiohttp-cors==0.7.0\r\naioredis==1.3.1\r\naiosignal==1.2.0\r\nanndata==0.7.8\r\nannoy==1.17.0\r\nappdirs==1.4.4\r\narboreto==0.1.6\r\nargon2-cffi==21.3.0\r\nargon2-cffi-bindings==21.2.0\r\nasttokens==2.0.5\r\nasync-timeout==4.0.2\r\nattr==0.3.1\r\nattrs==21.4.0\r\nbackcall==0.2.0\r\nbbknn==1.5.1\r\nbeautifulsoup4==4.11.1\r\nbioservices==1.8.4\r\nbleach==5.0.0\r\nblessed==1.19.1\r\nbokeh==2.4.2\r\nboltons==21.0.0\r\ncachetools==5.0.0\r\ncattrs==1.10.0\r\ncertifi==2021.10.8\r\ncffi==1.15.0\r\ncharset-normalizer==2.0.12\r\nclick==8.1.2\r\ncloudpickle==2.0.0\r\ncolorama==0.4.4\r\ncolorful==0.5.4\r\ncolorlog==6.6.0\r\ncryptography==36.0.2\r\nctxcore==0.1.1\r\ncycler==0.11.0\r\nCython==0.29.28\r\ncytoolz==0.11.2\r\ndask==2022.4.0\r\ndataclasses-json==0.5.7\r\ndebugpy==1.6.0\r\ndecorator==5.1.1\r\ndefusedxml==0.7.1\r\nDeprecated==1.2.13\r\ndeprecation==2.1.0\r\ndill==0.3.4\r\ndistributed==2022.4.0\r\ndocutils==0.18.1\r\neasydev==0.12.0\r\nentrypoints==0.4\r\nexecuting==0.8.3\r\nfastjsonschema==2.15.3\r\nfbpca==1.0\r\nfilelock==3.6.0\r\nfonttools==4.32.0\r\nfrozendict==2.3.1\r\nfrozenlist==1.3.0\r\nfsspec==2022.3.0\r\nfuture==0.18.2\r\ngensim==4.1.2\r\ngeosketch==1.2\r\ngermalemma==0.1.3\r\ngevent==21.12.0\r\nglobre==0.1.5\r\ngoogle-api-core==2.7.2\r\ngoogle-auth==2.6.4\r\ngoogleapis-common-protos==1.56.0\r\ngpustat==1.0.0b1\r\ngreenlet==1.1.2\r\ngrequests==0.6.0\r\ngrpcio==1.43.0\r\ngseapy==0.10.8\r\nh5py==3.6.0\r\nharmonypy==0.0.5\r\nHeapDict==1.0.1\r\nhiredis==2.0.0\r\nidna==3.3\r\nigraph==0.9.10\r\nimageio==2.16.2\r\nimportlib-resources==5.7.0\r\ninterlap==0.2.7\r\nintervaltree==3.1.0\r\nipykernel==6.13.0\r\nipympl==0.9.0\r\nipython==8.2.0\r\nipython-genutils==0.2.0\r\nipywidgets==7.7.0\r\njedi==0.18.1\r\nJinja2==3.1.1\r\njoblib==1.1.0\r\njsonschema==4.4.0\r\njupyter-client==7.2.2\r\njupyter-core==4.9.2\r\njupyterlab-pygments==0.2.2\r\njupyterlab-widgets==1.1.0\r\nkiwisolver==1.4.2\r\nlda==2.0.0\r\nleidenalg==0.8.9\r\nllvml
ite==0.38.0\r\nlocket==0.2.1\r\nloompy==3.0.7\r\nloomxpy==0.4.1\r\nlxml==4.8.0\r\nMACS2 @ file:///opt/conda/conda-bld/macs2_1645526782437/work\r\nMarkupSafe==2.1.1\r\nmarshmallow==3.15.0\r\nmarshmallow-enum==1.5.1\r\nmatplotlib==3.5.1\r\nmatplotlib-inline==0.1.3\r\nmistune==0.8.4\r\nmsgpack==1.0.3\r\nmudata==0.1.1\r\nmultidict==6.0.2\r\nmultiprocessing-on-dill==3.5.0a4\r\nmuon==0.1.2\r\nmypy-extensions==0.4.3\r\nnatsort==8.1.0\r\nnbclient==0.6.0\r\nnbconvert==6.5.0\r\nnbformat==5.3.0\r\nncls==0.0.64\r\nnest-asyncio==1.5.5\r\nnetworkx==2.8\r\nnltk==3.7\r\nnotebook==6.4.10\r\nnumba==0.55.1\r\nnumexpr==2.8.1\r\nnumpy @ file:///home/conda/feedstock_root/build_artifacts/numpy_1649572648093/work\r\nnumpy-groupies==0.9.14\r\nnvidia-ml-py3==7.352.0\r\nopencensus==0.8.0\r\nopencensus-context==0.1.2\r\npackaging==21.3\r\npandas==1.4.2\r\npandocfilters==1.5.0\r\nparso==0.8.3\r\npartd==1.2.0\r\npatsy==0.5.2\r\nPatternLite==3.6\r\npbr==3.1.1\r\npexpect==4.8.0\r\npickleshare==0.7.5\r\nPillow==9.1.0\r\nprometheus-client==0.14.1\r\nprompt-toolkit==3.0.29\r\nprotobuf==3.20.0\r\npsutil==5.9.0\r\nptyprocess==0.7.0\r\npure-eval==0.2.2\r\npy-spy==0.3.11\r\npyarrow==0.16.0\r\npyasn1==0.4.8\r\npyasn1-modules==0.2.8\r\npybedtools==0.9.0\r\npyBigWig==0.3.18\r\npybiomart==0.2.0\r\n-e git+ssh://git@github.com/aertslab/pycistarget.git@d2571bf92579addfc8a21465bfbcd0de86aedc61#egg=pycistarget\r\n-e 
git+ssh://git@github.com/aertslab/pycisTopic.git@ddb3caf583312f1169776a21650bc47ff4daebe6#egg=pycisTopic\r\npycparser==2.21\r\npyfasta==0.5.2\r\nPygments==2.11.2\r\npynndescent==0.5.6\r\npyOpenSSL==22.0.0\r\npyparsing==3.0.8\r\npyphen==0.12.0\r\npyranges==0.0.115\r\npyrle==0.0.34\r\npyrsistent==0.18.1\r\npysam==0.19.0\r\npyscenic==0.11.2\r\npython-dateutil==2.8.2\r\npython-igraph==0.9.10\r\npython-Levenshtein==0.12.2\r\npytz==2022.1\r\nPyWavelets==1.3.0\r\nPyYAML==6.0\r\npyzmq==22.3.0\r\nray==1.11.0\r\nredis==4.2.2\r\nregex==2022.3.15\r\nrequests==2.27.1\r\nrequests-cache==0.9.3\r\nrsa==4.8\r\nscanorama==1.7.2\r\nscanpy==1.9.1\r\n-e git+ssh://git@github.com/aertslab/scenicplus.git@6faa2d62416e3c78aeb45eead5c70cdafecb762f#egg=scenicplus\r\nscikit-image==0.19.2\r\nscikit-learn==0.24.2\r\nscipy==1.8.0\r\nscrublet==0.2.3\r\nseaborn==0.11.2\r\nSend2Trash==1.8.0\r\nsession-info==1.0.0\r\nsix==1.16.0\r\nsklearn==0.0\r\nsmart-open==5.2.1\r\nsorted-nearest==0.0.33\r\nsortedcontainers==2.4.0\r\nsoupsieve==2.3.2.post1\r\nstack-data==0.2.0\r\nstatistics==1.0.3.5\r\nstatsmodels==0.13.2\r\nstdlib-list==0.8.0\r\nsuds-community==1.1.0\r\ntables==3.7.0\r\ntabulate==0.8.9\r\ntblib==1.7.0\r\nterminado==0.13.3\r\ntexttable==1.6.4\r\nthreadpoolctl==3.1.0\r\ntifffile==2022.4.8\r\ntinycss2==1.1.1\r\ntmtoolkit==0.9.0\r\ntoolz==0.11.2\r\ntornado==6.1\r\ntqdm==4.64.0\r\ntraitlets==5.1.1\r\ntyping==3.7.4.3\r\ntyping-inspect==0.7.1\r\ntyping_extensions==4.1.1\r\numap-learn==0.5.3\r\nurl-normalize==1.4.3\r\nurllib3==1.26.9\r\nwcwidth==0.2.5\r\nwebencodings==0.5.1\r\nwidgetsnbextension==3.6.0\r\nwrapt==1.14.0\r\nxlrd==1.2.0\r\nxmltodict==0.12.0\r\nyarl==1.7.2\r\nzict==2.1.0\r\nzipp==3.8.0\r\nzope.event==4.5.0\r\nzope.interface==5.4.0\r\n\r\n \r\n\r\n**Additional context**\r\nI'm running in a JupyterHub instance hosted at our institute. 
The whole working directory is on an NFS.\r\n","reactions":{"url":"https://api.github.com/repos/aertslab/pycisTopic/issues/31/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/31/timeline","performed_via_github_app":null,"state_reason":"completed"},"comment":{"url":"https://api.github.com/repos/aertslab/pycisTopic/issues/comments/1991544199","html_url":"https://github.com/aertslab/pycisTopic/issues/31#issuecomment-1991544199","issue_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/31","id":1991544199,"node_id":"IC_kwDOE6n2Ps52tI2H","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_at":"2024-03-12T12:30:55Z","updated_at":"2024-03-12T12:30:55Z","author_association":"MEMBER","body":"Speed improvements in QC calculation are in the polars 
branch:\r\nhttps://pycistopic.readthedocs.io/en/polars/notebooks/human_cerebellum.html#QC","reactions":{"url":"https://api.github.com/repos/aertslab/pycisTopic/issues/comments/1991544199/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-12T12:30:56Z","org":{"id":3940817,"login":"aertslab","gravatar_id":"","url":"https://api.github.com/orgs/aertslab","avatar_url":"https://avatars.githubusercontent.com/u/3940817?"}},{"id":"36471982073","type":"IssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":329905726,"name":"aertslab/pycisTopic","url":"https://api.github.com/repos/aertslab/pycisTopic"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/aertslab/pycisTopic/issues/90","repository_url":"https://api.github.com/repos/aertslab/pycisTopic","labels_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/90/labels{/name}","comments_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/90/comments","events_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/90/events","html_url":"https://github.com/aertslab/pycisTopic/issues/90","id":1868841890,"node_id":"I_kwDOE6n2Ps5vZEOi","number":90,"title":"`valid_bc` argument for `barcode_rank_function` has unanticipated 
behavior","user":{"login":"adamklie","id":23622847,"node_id":"MDQ6VXNlcjIzNjIyODQ3","avatar_url":"https://avatars.githubusercontent.com/u/23622847?v=4","gravatar_id":"","url":"https://api.github.com/users/adamklie","html_url":"https://github.com/adamklie","followers_url":"https://api.github.com/users/adamklie/followers","following_url":"https://api.github.com/users/adamklie/following{/other_user}","gists_url":"https://api.github.com/users/adamklie/gists{/gist_id}","starred_url":"https://api.github.com/users/adamklie/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/adamklie/subscriptions","organizations_url":"https://api.github.com/users/adamklie/orgs","repos_url":"https://api.github.com/users/adamklie/repos","events_url":"https://api.github.com/users/adamklie/events{/privacy}","received_events_url":"https://api.github.com/users/adamklie/received_events","type":"User","site_admin":false},"labels":[],"state":"closed","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":5,"created_at":"2023-08-28T02:27:28Z","updated_at":"2024-03-12T12:27:27Z","closed_at":"2023-09-27T13:18:06Z","author_association":"COLLABORATOR","active_lock_reason":null,"body":"**Describe the bug**\r\nNot 100% sure if this is a bug, but wanted to bring it up. The `barcode_rank_plot` function has a parameter for `valid_bc` which I interpreted as a place to input a list of string barcodes to consider for the plot. Looking at the code for this function, it looks like it is just used to select the top number of barcodes for plot based on fragment number, similar to giving the actual number in the `n_bc` argument. \r\n\r\nNormally, this would be pretty innocuous, but this did some confusing things for me when I used it as part of the `compute_qc_stats_single` function. This is because that function calls `barcode_rank_plot` and uses the return data downstream. 
This return data contains the top `len(valid_bc)` barcodes in terms of number of fragments, rather than the actual barcodes I pass in. When the following code is then run:\r\n\r\n```python\r\nif valid_bc is None:\r\n fragments_df = fragments_df[\r\n fragments_df.Name.isin(set(metrics[\"barcode_rank_plot\"][\"valid_bc\"]))\r\n ]\r\n```\r\nThe `fragments_df` is subset down to cells that may not be contained in the initial `valid_bc` input (in fact in my case it only contained 12 of those cells and ~2000 other cell barcodes). All the downstream QCs that take in `fragments_df` now end up reporting metrics on a completely different set of barcodes.\r\n\r\n**Expected behavior**\r\nThinking about it, it probably doesn't make sense to pass a list of string barcodes into the barcode rank plot since they likely will be out of order if they were selected by some criteria other than number of fragments they contain. I actually think the root cause of my confusion is with some mislabeled \"high quality\" cells (hence only 12 barcodes showing up in the top ~2000 ranked by fragments), so I'm guessing most users won't actually need to worry about this, but in my case it led to more confusion instead of clarity.\r\n\r\nThe simple fix for the downstream QCs would seem to be to change the above code to the following:\r\n```python\r\nif valid_bc is not None:\r\n fragments_df = fragments_df[\r\n fragments_df.Name.isin(set(valid_bc))\r\n ]\r\n```\r\nThis would do the filtering on the desired barcodes instead of just on the top ones sorted by number of fragments.\r\n\r\nHopefully this makes sense! 
Love the package 😃 ","reactions":{"url":"https://api.github.com/repos/aertslab/pycisTopic/issues/90/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/90/timeline","performed_via_github_app":null,"state_reason":"completed"},"comment":{"url":"https://api.github.com/repos/aertslab/pycisTopic/issues/comments/1991537863","html_url":"https://github.com/aertslab/pycisTopic/issues/90#issuecomment-1991537863","issue_url":"https://api.github.com/repos/aertslab/pycisTopic/issues/90","id":1991537863,"node_id":"IC_kwDOE6n2Ps52tHTH","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_at":"2024-03-12T12:27:26Z","updated_at":"2024-03-12T12:27:26Z","author_association":"MEMBER","body":"@adamklie You can test it now if you want.\r\nFor now you still need to install pycisTopic from the polars branch till we switch the main branch:\r\nhttps://pycistopic.readthedocs.io/en/polars/notebooks/human_cerebellum.html#QC\r\n\r\nLet us know how it 
goes.","reactions":{"url":"https://api.github.com/repos/aertslab/pycisTopic/issues/comments/1991537863/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-12T12:27:27Z","org":{"id":3940817,"login":"aertslab","gravatar_id":"","url":"https://api.github.com/orgs/aertslab","avatar_url":"https://avatars.githubusercontent.com/u/3940817?"}},{"id":"36329670210","type":"IssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":391121521,"name":"aertslab/scenicplus","url":"https://api.github.com/repos/aertslab/scenicplus"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/aertslab/scenicplus/issues/303","repository_url":"https://api.github.com/repos/aertslab/scenicplus","labels_url":"https://api.github.com/repos/aertslab/scenicplus/issues/303/labels{/name}","comments_url":"https://api.github.com/repos/aertslab/scenicplus/issues/303/comments","events_url":"https://api.github.com/repos/aertslab/scenicplus/issues/303/events","html_url":"https://github.com/aertslab/scenicplus/issues/303","id":2144773473,"node_id":"I_kwDOF1AKcc5_1qVh","number":303,"title":"RuntimeError from pbmc tutorial ( 
export_pseudobulk)","user":{"login":"Tu4n-ph4m","id":123585899,"node_id":"U_kgDOB13Faw","avatar_url":"https://avatars.githubusercontent.com/u/123585899?v=4","gravatar_id":"","url":"https://api.github.com/users/Tu4n-ph4m","html_url":"https://github.com/Tu4n-ph4m","followers_url":"https://api.github.com/users/Tu4n-ph4m/followers","following_url":"https://api.github.com/users/Tu4n-ph4m/following{/other_user}","gists_url":"https://api.github.com/users/Tu4n-ph4m/gists{/gist_id}","starred_url":"https://api.github.com/users/Tu4n-ph4m/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/Tu4n-ph4m/subscriptions","organizations_url":"https://api.github.com/users/Tu4n-ph4m/orgs","repos_url":"https://api.github.com/users/Tu4n-ph4m/repos","events_url":"https://api.github.com/users/Tu4n-ph4m/events{/privacy}","received_events_url":"https://api.github.com/users/Tu4n-ph4m/received_events","type":"User","site_admin":false},"labels":[{"id":3214941548,"node_id":"MDU6TGFiZWwzMjE0OTQxNTQ4","url":"https://api.github.com/repos/aertslab/scenicplus/labels/question","name":"question","color":"d876e3","default":true,"description":"Further information is requested"}],"state":"open","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":3,"created_at":"2024-02-20T16:24:16Z","updated_at":"2024-03-07T09:37:48Z","closed_at":null,"author_association":"NONE","active_lock_reason":null,"body":" Sorryq for the duplicate threads. 
I wasn't sure where I should post the issue so I just went back and forth between scenicplus and pycistopic but then I think sceniplus might be of help for more people so here it is.\r\n \r\nDescribe the bug\r\nHi there, I'm trying to re-run the pbmc tutorial but it doesn't seem to work on my end.\r\n\r\nThe command I tried:\r\n\r\n```python\r\n\r\nfrom pycisTopic.pseudobulk_peak_calling import export_pseudobulk\r\nbw_paths, bed_paths = export_pseudobulk(input_data = cell_data,\r\nvariable = 'celltype', # variable by which to generate pseubulk profiles, in this case we want pseudobulks per celltype\r\nsample_id_col = 'sample_id',\r\nchromsizes = chromsizes,\r\nbed_path = os.path.join(work_dir, 'scATAC/consensus_peak_calling/pseudobulk_bed_files/'), # specify where pseudobulk_bed_files should be stored\r\nbigwig_path = os.path.join(work_dir, 'scATAC/consensus_peak_calling/pseudobulk_bw_files/'),# specify where pseudobulk_bw_files should be stored\r\npath_to_fragments = fragments_dict, # location of fragment fiels\r\nn_cpu = 8, # specify the number of cores to use, we use ray for multi processing\r\nnormalize_bigwig = True,\r\ntemp_dir = os.path.join(tmp_dir, 'ray_spill'),\r\nsplit_pattern = '-')\r\n\r\n```\r\n\r\nHere's the error I have:\r\n\r\n```python\r\n\r\nRuntimeError: You must provide a valid set of entries. These can be comprised of any of the following:\r\n\r\nA list of each of chromosomes, start positions, end positions and values.\r\nA list of each of start positions and values. 
Also, a chromosome and span must be specified.\r\nA list values, in which case a single chromosome, start position, span and step must be specified.\r\nI have tried getting the index file, commenting out remove duplicates but they don't seem to work\r\n\r\n```\r\n\r\nFull error message is as follows:\r\n\r\n```python\r\n\r\n2024-02-20 10:15:09,203 cisTopic INFO Splitting fragments by cell type.\r\n2024-02-20 10:15:42,588 cisTopic INFO generating bigwig files\r\n_RemoteTraceback Traceback (most recent call last)\r\n_RemoteTraceback:\r\n\"\"\"\r\nTraceback (most recent call last):\r\nFile \"/users/tpham43/.local/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py\", line 428, in _process_worker\r\nr = call_item()\r\nFile \"/users/tpham43/.local/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py\", line 275, in call\r\nreturn self.fn(*self.args, **self.kwargs)\r\nFile \"/users/tpham43/.local/lib/python3.8/site-packages/joblib/_parallel_backends.py\", line 620, in call\r\nreturn self.func(*args, **kwargs)\r\nFile \"/users/tpham43/.local/lib/python3.8/site-packages/joblib/parallel.py\", line 288, in call\r\nreturn [func(*args, **kwargs)\r\nFile \"/users/tpham43/.local/lib/python3.8/site-packages/joblib/parallel.py\", line 288, in\r\nreturn [func(*args, **kwargs)\r\nFile \"/users/tpham43/.conda/envs/scenicplus/lib/python3.8/site-packages/pycisTopic/pseudobulk_peak_calling.py\", line 33, in _generate_bigwig\r\nfragments_to_bw(\r\nFile \"/users/tpham43/.conda/envs/scenicplus/lib/python3.8/site-packages/scatac_fragment_tools/library/bigwig/fragments_to_bigwig.py\", line 566, in fragments_to_bw\r\nfragments_to_bw_with_pybigwig(\r\nFile \"/users/tpham43/.conda/envs/scenicplus/lib/python3.8/site-packages/scatac_fragment_tools/library/bigwig/fragments_to_bigwig.py\", line 464, in fragments_to_bw_with_pybigwig\r\nbw.addEntries(chroms=chroms, starts=starts, ends=ends, values=values)\r\nRuntimeError: You must provide a valid set of entries. 
These can be comprised of any of the following:\r\n\r\nA list of each of chromosomes, start positions, end positions and values.\r\nA list of each of start positions and values. Also, a chromosome and span must be specified.\r\nA list values, in which case a single chromosome, start position, span and step must be specified.\r\n\"\"\"\r\n\r\nThe above exception was the direct cause of the following exception:\r\n\r\nRuntimeError Traceback (most recent call last)\r\nCell In[26], line 2\r\n1 from pycisTopic.pseudobulk_peak_calling import export_pseudobulk\r\n----> 2 bw_paths, bed_paths = export_pseudobulk(input_data = cell_data,\r\n3 variable = 'celltype', # variable by which to generate pseubulk profiles, in this case we want pseudobulks per celltype\r\n4 sample_id_col = 'sample_id',\r\n5 chromsizes = chromsizes,\r\n6 bed_path = os.path.join(work_dir, 'scATAC/consensus_peak_calling/pseudobulk_bed_files/'), # specify where pseudobulk_bed_files should be stored\r\n7 bigwig_path = os.path.join(work_dir, 'scATAC/consensus_peak_calling/pseudobulk_bw_files/'),# specify where pseudobulk_bw_files should be stored\r\n8 path_to_fragments = fragments_dict, # location of fragment fiels\r\n9 n_cpu = 8, # specify the number of cores to use, we use ray for multi processing\r\n10 normalize_bigwig = True,\r\n11 # remove_duplicates = True,\r\n12 temp_dir = os.path.join(tmp_dir, 'ray_spill'),\r\n13 split_pattern = '-')\r\n\r\nFile ~/.conda/envs/scenicplus/lib/python3.8/site-packages/pycisTopic/pseudobulk_peak_calling.py:178, in export_pseudobulk(input_data, variable, chromsizes, bed_path, bigwig_path, path_to_fragments, sample_id_col, n_cpu, normalize_bigwig, split_pattern, temp_dir)\r\n175 log.warning(f\"Missing fragments for {cell_type}!\")\r\n177 log.info(\"generating bigwig files\")\r\n--> 178 joblib.Parallel(n_jobs=n_cpu)(\r\n179 joblib.delayed(_generate_bigwig)\r\n180 (\r\n181 path_to_fragments = bed_paths[cell_type],\r\n182 chromsizes = chromsizes_dict,\r\n183 normalize_bigwig 
= normalize_bigwig,\r\n184 bw_filename = os.path.join(bigwig_path, f\"{_santize_string_for_filename(cell_type)}.bw\"),\r\n185 log = log\r\n186 )\r\n187 for cell_type in bed_paths.keys()\r\n188 )\r\n189 bw_paths = {}\r\n190 for cell_type in cell_data[variable].unique():\r\n\r\nFile ~/.local/lib/python3.8/site-packages/joblib/parallel.py:1098, in Parallel.call(self, iterable)\r\n1095 self._iterating = False\r\n1097 with self._backend.retrieval_context():\r\n-> 1098 self.retrieve()\r\n1099 # Make sure that we get a last message telling us we are done\r\n1100 elapsed_time = time.time() - self._start_time\r\n\r\nFile ~/.local/lib/python3.8/site-packages/joblib/parallel.py:975, in Parallel.retrieve(self)\r\n973 try:\r\n974 if getattr(self._backend, 'supports_timeout', False):\r\n--> 975 self._output.extend(job.get(timeout=self.timeout))\r\n976 else:\r\n977 self._output.extend(job.get())\r\n\r\nFile ~/.local/lib/python3.8/site-packages/joblib/_parallel_backends.py:567, in LokyBackend.wrap_future_result(future, timeout)\r\n564 \"\"\"Wrapper for Future.result to implement the same behaviour as\r\n565 AsyncResults.get from multiprocessing.\"\"\"\r\n566 try:\r\n--> 567 return future.result(timeout=timeout)\r\n568 except CfTimeoutError as e:\r\n569 raise TimeoutError from e\r\n\r\nFile ~/.conda/envs/scenicplus/lib/python3.8/concurrent/futures/_base.py:444, in Future.result(self, timeout)\r\n442 raise CancelledError()\r\n443 elif self._state == FINISHED:\r\n--> 444 return self.__get_result()\r\n445 else:\r\n446 raise TimeoutError()\r\n\r\nFile ~/.conda/envs/scenicplus/lib/python3.8/concurrent/futures/_base.py:389, in Future.__get_result(self)\r\n387 if self._exception:\r\n388 try:\r\n--> 389 raise self._exception\r\n390 finally:\r\n391 # Break a reference cycle with the exception in self._exception\r\n392 self = None\r\n\r\nRuntimeError: You must provide a valid set of entries. 
These can be comprised of any of the following:\r\n\r\nA list of each of chromosomes, start positions, end positions and values.\r\nA list of each of start positions and values. Also, a chromosome and span must be specified.\r\nA list values, in which case a single chromosome, start position, span and step must be specified.\r\n\r\n```\r\n","reactions":{"url":"https://api.github.com/repos/aertslab/scenicplus/issues/303/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/aertslab/scenicplus/issues/303/timeline","performed_via_github_app":null,"state_reason":null},"comment":{"url":"https://api.github.com/repos/aertslab/scenicplus/issues/comments/1983103410","html_url":"https://github.com/aertslab/scenicplus/issues/303#issuecomment-1983103410","issue_url":"https://api.github.com/repos/aertslab/scenicplus/issues/303","id":1983103410,"node_id":"IC_kwDOF1AKcc52M8Gy","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_at":"2024-03-07T09:37:46Z","updated_at":"2024-03-07T09:37:46Z","author_association":"MEMBER","body":"Could you try running the commandline 
version: `scatac_fragment_tools bigwig` on your fragment files?\r\nhttps://aertslab.github.io/scatac_fragment_tools/bigwig.html","reactions":{"url":"https://api.github.com/repos/aertslab/scenicplus/issues/comments/1983103410/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-07T09:37:48Z","org":{"id":3940817,"login":"aertslab","gravatar_id":"","url":"https://api.github.com/orgs/aertslab","avatar_url":"https://avatars.githubusercontent.com/u/3940817?"}},{"id":"36230892065","type":"PushEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":310551663,"name":"aertslab/single_cell_toolkit","url":"https://api.github.com/repos/aertslab/single_cell_toolkit"},"payload":{"repository_id":310551663,"push_id":17381083648,"size":1,"distinct_size":1,"ref":"refs/heads/master","head":"6183aa4a56cf3a9b1b8071447d288be82ac2e475","before":"b13c1941afd8bbd38d99ec30507b1abe54bfcea6","commits":[{"sha":"6183aa4a56cf3a9b1b8071447d288be82ac2e475","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Add new create_fragments_file that uses noodels instead of rust_htslib.\n\nAdd new create_fragments_file that uses noodels instead of rust_htslib.\nArgument handling is now provided by Clap and output fragments file\nis BGZF-compressed by noodles instead of needing to rely on external\nbgzip compression. 
The number of BAN deompression and fragments file\ncompression threads are optimised for the fastest runtime with the\nleast amount of CPU overhead.","distinct":true,"url":"https://api.github.com/repos/aertslab/single_cell_toolkit/commits/6183aa4a56cf3a9b1b8071447d288be82ac2e475"}]},"public":true,"created_at":"2024-03-04T17:10:52Z","org":{"id":3940817,"login":"aertslab","gravatar_id":"","url":"https://api.github.com/orgs/aertslab","avatar_url":"https://avatars.githubusercontent.com/u/3940817?"}},{"id":"36230759584","type":"PushEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":310551663,"name":"aertslab/single_cell_toolkit","url":"https://api.github.com/repos/aertslab/single_cell_toolkit"},"payload":{"repository_id":310551663,"push_id":17381022676,"size":2,"distinct_size":2,"ref":"refs/heads/master","head":"b13c1941afd8bbd38d99ec30507b1abe54bfcea6","before":"c47187d2a5dcff79166d5dd813af5ee8643ee0c5","commits":[{"sha":"f67d0f5ab5bcf03b56f6c122707d6a1b75bb0cb6","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Rename create_fragments_file to create_fragments_file_htslib.\n\nRename create_fragments_file to create_fragments_file_htslib\nin anticipation of a new create_fragments_file that uses noodles\ninstead of rust_htslib.","distinct":true,"url":"https://api.github.com/repos/aertslab/single_cell_toolkit/commits/f67d0f5ab5bcf03b56f6c122707d6a1b75bb0cb6"},{"sha":"b13c1941afd8bbd38d99ec30507b1abe54bfcea6","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Add new create_fragments_file that uses noodels instead of rust_htslib.\n\nAdd new create_fragments_file that uses noodels instead of rust_htslib.\nArgument handling is now provided by Clap and output fragments file\nis BGZF-compressed by noodles instead of needing to rely on external\nbgzip 
compression. The number of BAN deompression and fragments file\ncompression threads are optimised for the fastest runtime with the\nleast amount of CPU overhead.","distinct":true,"url":"https://api.github.com/repos/aertslab/single_cell_toolkit/commits/b13c1941afd8bbd38d99ec30507b1abe54bfcea6"}]},"public":true,"created_at":"2024-03-04T17:06:53Z","org":{"id":3940817,"login":"aertslab","gravatar_id":"","url":"https://api.github.com/orgs/aertslab","avatar_url":"https://avatars.githubusercontent.com/u/3940817?"}},{"id":"36200726350","type":"IssueCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":30698177,"name":"rust-bio/rust-htslib","url":"https://api.github.com/repos/rust-bio/rust-htslib"},"payload":{"action":"created","issue":{"url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/422","repository_url":"https://api.github.com/repos/rust-bio/rust-htslib","labels_url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/422/labels{/name}","comments_url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/422/comments","events_url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/422/events","html_url":"https://github.com/rust-bio/rust-htslib/issues/422","id":2151092945,"node_id":"I_kwDOAdRqwc6ANxLR","number":422,"title":"How to access read group info from 
`Record`?","user":{"login":"Crispy13","id":47196430,"node_id":"MDQ6VXNlcjQ3MTk2NDMw","avatar_url":"https://avatars.githubusercontent.com/u/47196430?v=4","gravatar_id":"","url":"https://api.github.com/users/Crispy13","html_url":"https://github.com/Crispy13","followers_url":"https://api.github.com/users/Crispy13/followers","following_url":"https://api.github.com/users/Crispy13/following{/other_user}","gists_url":"https://api.github.com/users/Crispy13/gists{/gist_id}","starred_url":"https://api.github.com/users/Crispy13/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/Crispy13/subscriptions","organizations_url":"https://api.github.com/users/Crispy13/orgs","repos_url":"https://api.github.com/users/Crispy13/repos","events_url":"https://api.github.com/users/Crispy13/events{/privacy}","received_events_url":"https://api.github.com/users/Crispy13/received_events","type":"User","site_admin":false},"labels":[],"state":"open","locked":false,"assignee":null,"assignees":[],"milestone":null,"comments":1,"created_at":"2024-02-23T13:26:42Z","updated_at":"2024-03-03T18:18:17Z","closed_at":null,"author_association":"NONE","active_lock_reason":null,"body":"In java, I can access that info like `rec.getReadGroup().getFlowOrder()`.\r\n\r\nHow could I do that in 
rust-htslib?","reactions":{"url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/422/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/422/timeline","performed_via_github_app":null,"state_reason":null},"comment":{"url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/comments/1975251016","html_url":"https://github.com/rust-bio/rust-htslib/issues/422#issuecomment-1975251016","issue_url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/422","id":1975251016,"node_id":"IC_kwDOAdRqwc51u_BI","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"created_at":"2024-03-03T18:18:16Z","updated_at":"2024-03-03T18:18:16Z","author_association":"NONE","body":"```rust\r\nif let Ok(Aux::String(rg)) = record.aux(b\"RG\") {\r\n println!(\"{}\", 
&rg);\r\n}\r\n```","reactions":{"url":"https://api.github.com/repos/rust-bio/rust-htslib/issues/comments/1975251016/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"performed_via_github_app":null}},"public":true,"created_at":"2024-03-03T18:18:18Z","org":{"id":13785584,"login":"rust-bio","gravatar_id":"","url":"https://api.github.com/orgs/rust-bio","avatar_url":"https://avatars.githubusercontent.com/u/13785584?"}},{"id":"36190052990","type":"PullRequestReviewCommentEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":3261980,"name":"hoytech/vmtouch","url":"https://api.github.com/repos/hoytech/vmtouch"},"payload":{"action":"created","comment":{"url":"https://api.github.com/repos/hoytech/vmtouch/pulls/comments/1510081567","pull_request_review_id":1912746147,"id":1510081567,"node_id":"PRRC_kwDOADHGHM5aAgQf","diff_hunk":"@@ -169,7 +169,7 @@ void usage() {\n printf(\"Usage: vmtouch [OPTIONS] ... 
FILES OR DIRECTORIES ...\\n\\nOptions:\\n\");\n printf(\" -t touch pages into memory\\n\");\n printf(\" -e evict pages from memory\\n\");\n- printf(\" -l lock pages in physical memory with mlock(2)\\n\");\n+ printf(\" -l lock pages in physical memory with mlock(2) or mlock2(2)\\n\");","path":"vmtouch.c","commit_id":"60fa5b0931b2bfffd7503c2b74704b446f3d523f","original_commit_id":"ec3336173d18248f38c6f30f9e78a0316ebb3274","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"body":"Added.","created_at":"2024-03-02T21:46:13Z","updated_at":"2024-03-02T21:46:13Z","html_url":"https://github.com/hoytech/vmtouch/pull/98#discussion_r1510081567","pull_request_url":"https://api.github.com/repos/hoytech/vmtouch/pulls/98","author_association":"NONE","_links":{"self":{"href":"https://api.github.com/repos/hoytech/vmtouch/pulls/comments/1510081567"},"html":{"href":"https://github.com/hoytech/vmtouch/pull/98#discussion_r1510081567"},"pull_request":{"href":"https://api.github.com/repos/hoytech/vmtouch/pulls/98"}},"reactions":{"url":"https://api.github.com/repos/hoytech/vmtouch/pulls/comments/1510081567/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0
,"heart":0,"rocket":0,"eyes":0},"start_line":null,"original_start_line":null,"start_side":null,"line":null,"original_line":172,"side":"RIGHT","in_reply_to_id":1509968186,"original_position":5,"position":null,"subject_type":"line"},"pull_request":{"url":"https://api.github.com/repos/hoytech/vmtouch/pulls/98","id":927564365,"node_id":"PR_kwDOADHGHM43SYJN","html_url":"https://github.com/hoytech/vmtouch/pull/98","diff_url":"https://github.com/hoytech/vmtouch/pull/98.diff","patch_url":"https://github.com/hoytech/vmtouch/pull/98.patch","issue_url":"https://api.github.com/repos/hoytech/vmtouch/issues/98","number":98,"state":"open","locked":false,"title":"Use mlock2() instead of mlock() on Linux and use mlockall(MCL_FUTURE).","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"body":"Use mlock2() instead of mlock() on Linux:\r\n - mlock2() allows a third argument MLOCK_ONFAULT, which allow to\r\n lock the pages in a range before touching them, so mlock2()\r\n can be called before touching all pages instead of after.\r\n - \"vmtouch -t -l\" some_file would not keep the pages in cache\r\n after touching them, so at the lock() step they needed to\r\n be read again 
from disk. After this patch files are cached\r\n twice as fast on this system.\r\n\r\nUse mlockall(MCL_FUTURE) instead of mlockall(MCL_CURRENT):\r\n - Set mlockall(MCL_FUTURE) before touching pages to have similar\r\n behaviour as the mlock2() approach above.","created_at":"2022-05-04T12:27:58Z","updated_at":"2024-03-02T21:46:13Z","closed_at":null,"merged_at":null,"merge_commit_sha":"d05fc1c4afd12e1d8608b0b432bbe2f6baaeba81","assignee":null,"assignees":[],"requested_reviewers":[],"requested_teams":[],"labels":[],"milestone":null,"draft":false,"commits_url":"https://api.github.com/repos/hoytech/vmtouch/pulls/98/commits","review_comments_url":"https://api.github.com/repos/hoytech/vmtouch/pulls/98/comments","review_comment_url":"https://api.github.com/repos/hoytech/vmtouch/pulls/comments{/number}","comments_url":"https://api.github.com/repos/hoytech/vmtouch/issues/98/comments","statuses_url":"https://api.github.com/repos/hoytech/vmtouch/statuses/60fa5b0931b2bfffd7503c2b74704b446f3d523f","head":{"label":"ghuls:mlock2","ref":"mlock2","sha":"60fa5b0931b2bfffd7503c2b74704b446f3d523f","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"repo":{"id":488579310,"node_
id":"R_kgDOHR8g7g","name":"vmtouch","full_name":"ghuls/vmtouch","private":false,"owner":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"html_url":"https://github.com/ghuls/vmtouch","description":"Portable file system cache diagnostics and 
control","fork":true,"url":"https://api.github.com/repos/ghuls/vmtouch","forks_url":"https://api.github.com/repos/ghuls/vmtouch/forks","keys_url":"https://api.github.com/repos/ghuls/vmtouch/keys{/key_id}","collaborators_url":"https://api.github.com/repos/ghuls/vmtouch/collaborators{/collaborator}","teams_url":"https://api.github.com/repos/ghuls/vmtouch/teams","hooks_url":"https://api.github.com/repos/ghuls/vmtouch/hooks","issue_events_url":"https://api.github.com/repos/ghuls/vmtouch/issues/events{/number}","events_url":"https://api.github.com/repos/ghuls/vmtouch/events","assignees_url":"https://api.github.com/repos/ghuls/vmtouch/assignees{/user}","branches_url":"https://api.github.com/repos/ghuls/vmtouch/branches{/branch}","tags_url":"https://api.github.com/repos/ghuls/vmtouch/tags","blobs_url":"https://api.github.com/repos/ghuls/vmtouch/git/blobs{/sha}","git_tags_url":"https://api.github.com/repos/ghuls/vmtouch/git/tags{/sha}","git_refs_url":"https://api.github.com/repos/ghuls/vmtouch/git/refs{/sha}","trees_url":"https://api.github.com/repos/ghuls/vmtouch/git/trees{/sha}","statuses_url":"https://api.github.com/repos/ghuls/vmtouch/statuses/{sha}","languages_url":"https://api.github.com/repos/ghuls/vmtouch/languages","stargazers_url":"https://api.github.com/repos/ghuls/vmtouch/stargazers","contributors_url":"https://api.github.com/repos/ghuls/vmtouch/contributors","subscribers_url":"https://api.github.com/repos/ghuls/vmtouch/subscribers","subscription_url":"https://api.github.com/repos/ghuls/vmtouch/subscription","commits_url":"https://api.github.com/repos/ghuls/vmtouch/commits{/sha}","git_commits_url":"https://api.github.com/repos/ghuls/vmtouch/git/commits{/sha}","comments_url":"https://api.github.com/repos/ghuls/vmtouch/comments{/number}","issue_comment_url":"https://api.github.com/repos/ghuls/vmtouch/issues/comments{/number}","contents_url":"https://api.github.com/repos/ghuls/vmtouch/contents/{+path}","compare_url":"https://api.github.com/repos/ghuls/vmtouch/compa
re/{base}...{head}","merges_url":"https://api.github.com/repos/ghuls/vmtouch/merges","archive_url":"https://api.github.com/repos/ghuls/vmtouch/{archive_format}{/ref}","downloads_url":"https://api.github.com/repos/ghuls/vmtouch/downloads","issues_url":"https://api.github.com/repos/ghuls/vmtouch/issues{/number}","pulls_url":"https://api.github.com/repos/ghuls/vmtouch/pulls{/number}","milestones_url":"https://api.github.com/repos/ghuls/vmtouch/milestones{/number}","notifications_url":"https://api.github.com/repos/ghuls/vmtouch/notifications{?since,all,participating}","labels_url":"https://api.github.com/repos/ghuls/vmtouch/labels{/name}","releases_url":"https://api.github.com/repos/ghuls/vmtouch/releases{/id}","deployments_url":"https://api.github.com/repos/ghuls/vmtouch/deployments","created_at":"2022-05-04T12:27:40Z","updated_at":"2022-05-01T15:14:00Z","pushed_at":"2024-03-02T21:44:40Z","git_url":"git://github.com/ghuls/vmtouch.git","ssh_url":"git@github.com:ghuls/vmtouch.git","clone_url":"https://github.com/ghuls/vmtouch.git","svn_url":"https://github.com/ghuls/vmtouch","homepage":"https://hoytech.com/vmtouch/","size":367,"stargazers_count":0,"watchers_count":0,"language":null,"has_issues":false,"has_projects":true,"has_downloads":true,"has_wiki":true,"has_pages":false,"has_discussions":false,"forks_count":0,"mirror_url":null,"archived":false,"disabled":false,"open_issues_count":0,"license":{"key":"other","name":"Other","spdx_id":"NOASSERTION","url":null,"node_id":"MDc6TGljZW5zZTA="},"allow_forking":true,"is_template":false,"web_commit_signoff_required":false,"topics":[],"visibility":"public","forks":0,"open_issues":0,"watchers":0,"default_branch":"master"}},"base":{"label":"hoytech:master","ref":"master","sha":"af86e27675843b3c7e4ddfee66ddbaf44eff43c4","user":{"login":"hoytech","id":144548,"node_id":"MDQ6VXNlcjE0NDU0OA==","avatar_url":"https://avatars.githubusercontent.com/u/144548?v=4","gravatar_id":"","url":"https://api.github.com/users/hoytech","html_url":"https
://github.com/hoytech","followers_url":"https://api.github.com/users/hoytech/followers","following_url":"https://api.github.com/users/hoytech/following{/other_user}","gists_url":"https://api.github.com/users/hoytech/gists{/gist_id}","starred_url":"https://api.github.com/users/hoytech/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/hoytech/subscriptions","organizations_url":"https://api.github.com/users/hoytech/orgs","repos_url":"https://api.github.com/users/hoytech/repos","events_url":"https://api.github.com/users/hoytech/events{/privacy}","received_events_url":"https://api.github.com/users/hoytech/received_events","type":"User","site_admin":false},"repo":{"id":3261980,"node_id":"MDEwOlJlcG9zaXRvcnkzMjYxOTgw","name":"vmtouch","full_name":"hoytech/vmtouch","private":false,"owner":{"login":"hoytech","id":144548,"node_id":"MDQ6VXNlcjE0NDU0OA==","avatar_url":"https://avatars.githubusercontent.com/u/144548?v=4","gravatar_id":"","url":"https://api.github.com/users/hoytech","html_url":"https://github.com/hoytech","followers_url":"https://api.github.com/users/hoytech/followers","following_url":"https://api.github.com/users/hoytech/following{/other_user}","gists_url":"https://api.github.com/users/hoytech/gists{/gist_id}","starred_url":"https://api.github.com/users/hoytech/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/hoytech/subscriptions","organizations_url":"https://api.github.com/users/hoytech/orgs","repos_url":"https://api.github.com/users/hoytech/repos","events_url":"https://api.github.com/users/hoytech/events{/privacy}","received_events_url":"https://api.github.com/users/hoytech/received_events","type":"User","site_admin":false},"html_url":"https://github.com/hoytech/vmtouch","description":"Portable file system cache diagnostics and 
control","fork":false,"url":"https://api.github.com/repos/hoytech/vmtouch","forks_url":"https://api.github.com/repos/hoytech/vmtouch/forks","keys_url":"https://api.github.com/repos/hoytech/vmtouch/keys{/key_id}","collaborators_url":"https://api.github.com/repos/hoytech/vmtouch/collaborators{/collaborator}","teams_url":"https://api.github.com/repos/hoytech/vmtouch/teams","hooks_url":"https://api.github.com/repos/hoytech/vmtouch/hooks","issue_events_url":"https://api.github.com/repos/hoytech/vmtouch/issues/events{/number}","events_url":"https://api.github.com/repos/hoytech/vmtouch/events","assignees_url":"https://api.github.com/repos/hoytech/vmtouch/assignees{/user}","branches_url":"https://api.github.com/repos/hoytech/vmtouch/branches{/branch}","tags_url":"https://api.github.com/repos/hoytech/vmtouch/tags","blobs_url":"https://api.github.com/repos/hoytech/vmtouch/git/blobs{/sha}","git_tags_url":"https://api.github.com/repos/hoytech/vmtouch/git/tags{/sha}","git_refs_url":"https://api.github.com/repos/hoytech/vmtouch/git/refs{/sha}","trees_url":"https://api.github.com/repos/hoytech/vmtouch/git/trees{/sha}","statuses_url":"https://api.github.com/repos/hoytech/vmtouch/statuses/{sha}","languages_url":"https://api.github.com/repos/hoytech/vmtouch/languages","stargazers_url":"https://api.github.com/repos/hoytech/vmtouch/stargazers","contributors_url":"https://api.github.com/repos/hoytech/vmtouch/contributors","subscribers_url":"https://api.github.com/repos/hoytech/vmtouch/subscribers","subscription_url":"https://api.github.com/repos/hoytech/vmtouch/subscription","commits_url":"https://api.github.com/repos/hoytech/vmtouch/commits{/sha}","git_commits_url":"https://api.github.com/repos/hoytech/vmtouch/git/commits{/sha}","comments_url":"https://api.github.com/repos/hoytech/vmtouch/comments{/number}","issue_comment_url":"https://api.github.com/repos/hoytech/vmtouch/issues/comments{/number}","contents_url":"https://api.github.com/repos/hoytech/vmtouch/contents/{+path}","compare_u
rl":"https://api.github.com/repos/hoytech/vmtouch/compare/{base}...{head}","merges_url":"https://api.github.com/repos/hoytech/vmtouch/merges","archive_url":"https://api.github.com/repos/hoytech/vmtouch/{archive_format}{/ref}","downloads_url":"https://api.github.com/repos/hoytech/vmtouch/downloads","issues_url":"https://api.github.com/repos/hoytech/vmtouch/issues{/number}","pulls_url":"https://api.github.com/repos/hoytech/vmtouch/pulls{/number}","milestones_url":"https://api.github.com/repos/hoytech/vmtouch/milestones{/number}","notifications_url":"https://api.github.com/repos/hoytech/vmtouch/notifications{?since,all,participating}","labels_url":"https://api.github.com/repos/hoytech/vmtouch/labels{/name}","releases_url":"https://api.github.com/repos/hoytech/vmtouch/releases{/id}","deployments_url":"https://api.github.com/repos/hoytech/vmtouch/deployments","created_at":"2012-01-25T02:45:32Z","updated_at":"2024-02-29T10:02:55Z","pushed_at":"2024-03-02T21:44:42Z","git_url":"git://github.com/hoytech/vmtouch.git","ssh_url":"git@github.com:hoytech/vmtouch.git","clone_url":"https://github.com/hoytech/vmtouch.git","svn_url":"https://github.com/hoytech/vmtouch","homepage":"https://hoytech.com/vmtouch/","size":367,"stargazers_count":1730,"watchers_count":1730,"language":"C","has_issues":true,"has_projects":true,"has_downloads":true,"has_wiki":true,"has_pages":false,"has_discussions":false,"forks_count":207,"mirror_url":null,"archived":false,"disabled":false,"open_issues_count":48,"license":{"key":"bsd-3-clause","name":"BSD 3-Clause \"New\" or \"Revised\" 
License","spdx_id":"BSD-3-Clause","url":"https://api.github.com/licenses/bsd-3-clause","node_id":"MDc6TGljZW5zZTU="},"allow_forking":true,"is_template":false,"web_commit_signoff_required":false,"topics":["cache","evict","filesystem-cache","lock-memory","mlock","page","pages","paging","touch","virtual-memory"],"visibility":"public","forks":207,"open_issues":48,"watchers":1730,"default_branch":"master"}},"_links":{"self":{"href":"https://api.github.com/repos/hoytech/vmtouch/pulls/98"},"html":{"href":"https://github.com/hoytech/vmtouch/pull/98"},"issue":{"href":"https://api.github.com/repos/hoytech/vmtouch/issues/98"},"comments":{"href":"https://api.github.com/repos/hoytech/vmtouch/issues/98/comments"},"review_comments":{"href":"https://api.github.com/repos/hoytech/vmtouch/pulls/98/comments"},"review_comment":{"href":"https://api.github.com/repos/hoytech/vmtouch/pulls/comments{/number}"},"commits":{"href":"https://api.github.com/repos/hoytech/vmtouch/pulls/98/commits"},"statuses":{"href":"https://api.github.com/repos/hoytech/vmtouch/statuses/60fa5b0931b2bfffd7503c2b74704b446f3d523f"}},"author_association":"NONE","auto_merge":null,"active_lock_reason":null}},"public":true,"created_at":"2024-03-02T21:46:13Z"},{"id":"36190052980","type":"PullRequestReviewEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":3261980,"name":"hoytech/vmtouch","url":"https://api.github.com/repos/hoytech/vmtouch"},"payload":{"action":"created","review":{"id":1912746147,"node_id":"PRR_kwDOADHGHM5yAjCj","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghul
s/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"body":null,"commit_id":"ec3336173d18248f38c6f30f9e78a0316ebb3274","submitted_at":"2024-03-02T21:46:13Z","state":"commented","html_url":"https://github.com/hoytech/vmtouch/pull/98#pullrequestreview-1912746147","pull_request_url":"https://api.github.com/repos/hoytech/vmtouch/pulls/98","author_association":"NONE","_links":{"html":{"href":"https://github.com/hoytech/vmtouch/pull/98#pullrequestreview-1912746147"},"pull_request":{"href":"https://api.github.com/repos/hoytech/vmtouch/pulls/98"}}},"pull_request":{"url":"https://api.github.com/repos/hoytech/vmtouch/pulls/98","id":927564365,"node_id":"PR_kwDOADHGHM43SYJN","html_url":"https://github.com/hoytech/vmtouch/pull/98","diff_url":"https://github.com/hoytech/vmtouch/pull/98.diff","patch_url":"https://github.com/hoytech/vmtouch/pull/98.patch","issue_url":"https://api.github.com/repos/hoytech/vmtouch/issues/98","number":98,"state":"open","locked":false,"title":"Use mlock2() instead of mlock() on Linux and use 
mlockall(MCL_FUTURE).","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"body":"Use mlock2() instead of mlock() on Linux:\r\n - mlock2() allows a third argument MLOCK_ONFAULT, which allow to\r\n lock the pages in a range before touching them, so mlock2()\r\n can be called before touching all pages instead of after.\r\n - \"vmtouch -t -l\" some_file would not keep the pages in cache\r\n after touching them, so at the lock() step they needed to\r\n be read again from disk. 
After this patch files are cached\r\n twice as fast on this system.\r\n\r\nUse mlockall(MCL_FUTURE) instead of mlockall(MCL_CURRENT):\r\n - Set mlockall(MCL_FUTURE) before touching pages to have similar\r\n behaviour as the mlock2() approach above.","created_at":"2022-05-04T12:27:58Z","updated_at":"2024-03-02T21:46:13Z","closed_at":null,"merged_at":null,"merge_commit_sha":"d05fc1c4afd12e1d8608b0b432bbe2f6baaeba81","assignee":null,"assignees":[],"requested_reviewers":[],"requested_teams":[],"labels":[],"milestone":null,"draft":false,"commits_url":"https://api.github.com/repos/hoytech/vmtouch/pulls/98/commits","review_comments_url":"https://api.github.com/repos/hoytech/vmtouch/pulls/98/comments","review_comment_url":"https://api.github.com/repos/hoytech/vmtouch/pulls/comments{/number}","comments_url":"https://api.github.com/repos/hoytech/vmtouch/issues/98/comments","statuses_url":"https://api.github.com/repos/hoytech/vmtouch/statuses/60fa5b0931b2bfffd7503c2b74704b446f3d523f","head":{"label":"ghuls:mlock2","ref":"mlock2","sha":"60fa5b0931b2bfffd7503c2b74704b446f3d523f","user":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"repo":{"id":488579310,"node_id":"R_kgDO
HR8g7g","name":"vmtouch","full_name":"ghuls/vmtouch","private":false,"owner":{"login":"ghuls","id":1299177,"node_id":"MDQ6VXNlcjEyOTkxNzc=","avatar_url":"https://avatars.githubusercontent.com/u/1299177?v=4","gravatar_id":"","url":"https://api.github.com/users/ghuls","html_url":"https://github.com/ghuls","followers_url":"https://api.github.com/users/ghuls/followers","following_url":"https://api.github.com/users/ghuls/following{/other_user}","gists_url":"https://api.github.com/users/ghuls/gists{/gist_id}","starred_url":"https://api.github.com/users/ghuls/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/ghuls/subscriptions","organizations_url":"https://api.github.com/users/ghuls/orgs","repos_url":"https://api.github.com/users/ghuls/repos","events_url":"https://api.github.com/users/ghuls/events{/privacy}","received_events_url":"https://api.github.com/users/ghuls/received_events","type":"User","site_admin":false},"html_url":"https://github.com/ghuls/vmtouch","description":"Portable file system cache diagnostics and 
control","fork":true,"url":"https://api.github.com/repos/ghuls/vmtouch","forks_url":"https://api.github.com/repos/ghuls/vmtouch/forks","keys_url":"https://api.github.com/repos/ghuls/vmtouch/keys{/key_id}","collaborators_url":"https://api.github.com/repos/ghuls/vmtouch/collaborators{/collaborator}","teams_url":"https://api.github.com/repos/ghuls/vmtouch/teams","hooks_url":"https://api.github.com/repos/ghuls/vmtouch/hooks","issue_events_url":"https://api.github.com/repos/ghuls/vmtouch/issues/events{/number}","events_url":"https://api.github.com/repos/ghuls/vmtouch/events","assignees_url":"https://api.github.com/repos/ghuls/vmtouch/assignees{/user}","branches_url":"https://api.github.com/repos/ghuls/vmtouch/branches{/branch}","tags_url":"https://api.github.com/repos/ghuls/vmtouch/tags","blobs_url":"https://api.github.com/repos/ghuls/vmtouch/git/blobs{/sha}","git_tags_url":"https://api.github.com/repos/ghuls/vmtouch/git/tags{/sha}","git_refs_url":"https://api.github.com/repos/ghuls/vmtouch/git/refs{/sha}","trees_url":"https://api.github.com/repos/ghuls/vmtouch/git/trees{/sha}","statuses_url":"https://api.github.com/repos/ghuls/vmtouch/statuses/{sha}","languages_url":"https://api.github.com/repos/ghuls/vmtouch/languages","stargazers_url":"https://api.github.com/repos/ghuls/vmtouch/stargazers","contributors_url":"https://api.github.com/repos/ghuls/vmtouch/contributors","subscribers_url":"https://api.github.com/repos/ghuls/vmtouch/subscribers","subscription_url":"https://api.github.com/repos/ghuls/vmtouch/subscription","commits_url":"https://api.github.com/repos/ghuls/vmtouch/commits{/sha}","git_commits_url":"https://api.github.com/repos/ghuls/vmtouch/git/commits{/sha}","comments_url":"https://api.github.com/repos/ghuls/vmtouch/comments{/number}","issue_comment_url":"https://api.github.com/repos/ghuls/vmtouch/issues/comments{/number}","contents_url":"https://api.github.com/repos/ghuls/vmtouch/contents/{+path}","compare_url":"https://api.github.com/repos/ghuls/vmtouch/compa
re/{base}...{head}","merges_url":"https://api.github.com/repos/ghuls/vmtouch/merges","archive_url":"https://api.github.com/repos/ghuls/vmtouch/{archive_format}{/ref}","downloads_url":"https://api.github.com/repos/ghuls/vmtouch/downloads","issues_url":"https://api.github.com/repos/ghuls/vmtouch/issues{/number}","pulls_url":"https://api.github.com/repos/ghuls/vmtouch/pulls{/number}","milestones_url":"https://api.github.com/repos/ghuls/vmtouch/milestones{/number}","notifications_url":"https://api.github.com/repos/ghuls/vmtouch/notifications{?since,all,participating}","labels_url":"https://api.github.com/repos/ghuls/vmtouch/labels{/name}","releases_url":"https://api.github.com/repos/ghuls/vmtouch/releases{/id}","deployments_url":"https://api.github.com/repos/ghuls/vmtouch/deployments","created_at":"2022-05-04T12:27:40Z","updated_at":"2022-05-01T15:14:00Z","pushed_at":"2024-03-02T21:44:40Z","git_url":"git://github.com/ghuls/vmtouch.git","ssh_url":"git@github.com:ghuls/vmtouch.git","clone_url":"https://github.com/ghuls/vmtouch.git","svn_url":"https://github.com/ghuls/vmtouch","homepage":"https://hoytech.com/vmtouch/","size":367,"stargazers_count":0,"watchers_count":0,"language":null,"has_issues":false,"has_projects":true,"has_downloads":true,"has_wiki":true,"has_pages":false,"has_discussions":false,"forks_count":0,"mirror_url":null,"archived":false,"disabled":false,"open_issues_count":0,"license":{"key":"other","name":"Other","spdx_id":"NOASSERTION","url":null,"node_id":"MDc6TGljZW5zZTA="},"allow_forking":true,"is_template":false,"web_commit_signoff_required":false,"topics":[],"visibility":"public","forks":0,"open_issues":0,"watchers":0,"default_branch":"master"}},"base":{"label":"hoytech:master","ref":"master","sha":"af86e27675843b3c7e4ddfee66ddbaf44eff43c4","user":{"login":"hoytech","id":144548,"node_id":"MDQ6VXNlcjE0NDU0OA==","avatar_url":"https://avatars.githubusercontent.com/u/144548?v=4","gravatar_id":"","url":"https://api.github.com/users/hoytech","html_url":"https
://github.com/hoytech","followers_url":"https://api.github.com/users/hoytech/followers","following_url":"https://api.github.com/users/hoytech/following{/other_user}","gists_url":"https://api.github.com/users/hoytech/gists{/gist_id}","starred_url":"https://api.github.com/users/hoytech/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/hoytech/subscriptions","organizations_url":"https://api.github.com/users/hoytech/orgs","repos_url":"https://api.github.com/users/hoytech/repos","events_url":"https://api.github.com/users/hoytech/events{/privacy}","received_events_url":"https://api.github.com/users/hoytech/received_events","type":"User","site_admin":false},"repo":{"id":3261980,"node_id":"MDEwOlJlcG9zaXRvcnkzMjYxOTgw","name":"vmtouch","full_name":"hoytech/vmtouch","private":false,"owner":{"login":"hoytech","id":144548,"node_id":"MDQ6VXNlcjE0NDU0OA==","avatar_url":"https://avatars.githubusercontent.com/u/144548?v=4","gravatar_id":"","url":"https://api.github.com/users/hoytech","html_url":"https://github.com/hoytech","followers_url":"https://api.github.com/users/hoytech/followers","following_url":"https://api.github.com/users/hoytech/following{/other_user}","gists_url":"https://api.github.com/users/hoytech/gists{/gist_id}","starred_url":"https://api.github.com/users/hoytech/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/hoytech/subscriptions","organizations_url":"https://api.github.com/users/hoytech/orgs","repos_url":"https://api.github.com/users/hoytech/repos","events_url":"https://api.github.com/users/hoytech/events{/privacy}","received_events_url":"https://api.github.com/users/hoytech/received_events","type":"User","site_admin":false},"html_url":"https://github.com/hoytech/vmtouch","description":"Portable file system cache diagnostics and 
control","fork":false,"url":"https://api.github.com/repos/hoytech/vmtouch","forks_url":"https://api.github.com/repos/hoytech/vmtouch/forks","keys_url":"https://api.github.com/repos/hoytech/vmtouch/keys{/key_id}","collaborators_url":"https://api.github.com/repos/hoytech/vmtouch/collaborators{/collaborator}","teams_url":"https://api.github.com/repos/hoytech/vmtouch/teams","hooks_url":"https://api.github.com/repos/hoytech/vmtouch/hooks","issue_events_url":"https://api.github.com/repos/hoytech/vmtouch/issues/events{/number}","events_url":"https://api.github.com/repos/hoytech/vmtouch/events","assignees_url":"https://api.github.com/repos/hoytech/vmtouch/assignees{/user}","branches_url":"https://api.github.com/repos/hoytech/vmtouch/branches{/branch}","tags_url":"https://api.github.com/repos/hoytech/vmtouch/tags","blobs_url":"https://api.github.com/repos/hoytech/vmtouch/git/blobs{/sha}","git_tags_url":"https://api.github.com/repos/hoytech/vmtouch/git/tags{/sha}","git_refs_url":"https://api.github.com/repos/hoytech/vmtouch/git/refs{/sha}","trees_url":"https://api.github.com/repos/hoytech/vmtouch/git/trees{/sha}","statuses_url":"https://api.github.com/repos/hoytech/vmtouch/statuses/{sha}","languages_url":"https://api.github.com/repos/hoytech/vmtouch/languages","stargazers_url":"https://api.github.com/repos/hoytech/vmtouch/stargazers","contributors_url":"https://api.github.com/repos/hoytech/vmtouch/contributors","subscribers_url":"https://api.github.com/repos/hoytech/vmtouch/subscribers","subscription_url":"https://api.github.com/repos/hoytech/vmtouch/subscription","commits_url":"https://api.github.com/repos/hoytech/vmtouch/commits{/sha}","git_commits_url":"https://api.github.com/repos/hoytech/vmtouch/git/commits{/sha}","comments_url":"https://api.github.com/repos/hoytech/vmtouch/comments{/number}","issue_comment_url":"https://api.github.com/repos/hoytech/vmtouch/issues/comments{/number}","contents_url":"https://api.github.com/repos/hoytech/vmtouch/contents/{+path}","compare_u
rl":"https://api.github.com/repos/hoytech/vmtouch/compare/{base}...{head}","merges_url":"https://api.github.com/repos/hoytech/vmtouch/merges","archive_url":"https://api.github.com/repos/hoytech/vmtouch/{archive_format}{/ref}","downloads_url":"https://api.github.com/repos/hoytech/vmtouch/downloads","issues_url":"https://api.github.com/repos/hoytech/vmtouch/issues{/number}","pulls_url":"https://api.github.com/repos/hoytech/vmtouch/pulls{/number}","milestones_url":"https://api.github.com/repos/hoytech/vmtouch/milestones{/number}","notifications_url":"https://api.github.com/repos/hoytech/vmtouch/notifications{?since,all,participating}","labels_url":"https://api.github.com/repos/hoytech/vmtouch/labels{/name}","releases_url":"https://api.github.com/repos/hoytech/vmtouch/releases{/id}","deployments_url":"https://api.github.com/repos/hoytech/vmtouch/deployments","created_at":"2012-01-25T02:45:32Z","updated_at":"2024-02-29T10:02:55Z","pushed_at":"2024-03-02T21:44:42Z","git_url":"git://github.com/hoytech/vmtouch.git","ssh_url":"git@github.com:hoytech/vmtouch.git","clone_url":"https://github.com/hoytech/vmtouch.git","svn_url":"https://github.com/hoytech/vmtouch","homepage":"https://hoytech.com/vmtouch/","size":367,"stargazers_count":1730,"watchers_count":1730,"language":"C","has_issues":true,"has_projects":true,"has_downloads":true,"has_wiki":true,"has_pages":false,"has_discussions":false,"forks_count":207,"mirror_url":null,"archived":false,"disabled":false,"open_issues_count":48,"license":{"key":"bsd-3-clause","name":"BSD 3-Clause \"New\" or \"Revised\" 
License","spdx_id":"BSD-3-Clause","url":"https://api.github.com/licenses/bsd-3-clause","node_id":"MDc6TGljZW5zZTU="},"allow_forking":true,"is_template":false,"web_commit_signoff_required":false,"topics":["cache","evict","filesystem-cache","lock-memory","mlock","page","pages","paging","touch","virtual-memory"],"visibility":"public","forks":207,"open_issues":48,"watchers":1730,"default_branch":"master"}},"_links":{"self":{"href":"https://api.github.com/repos/hoytech/vmtouch/pulls/98"},"html":{"href":"https://github.com/hoytech/vmtouch/pull/98"},"issue":{"href":"https://api.github.com/repos/hoytech/vmtouch/issues/98"},"comments":{"href":"https://api.github.com/repos/hoytech/vmtouch/issues/98/comments"},"review_comments":{"href":"https://api.github.com/repos/hoytech/vmtouch/pulls/98/comments"},"review_comment":{"href":"https://api.github.com/repos/hoytech/vmtouch/pulls/comments{/number}"},"commits":{"href":"https://api.github.com/repos/hoytech/vmtouch/pulls/98/commits"},"statuses":{"href":"https://api.github.com/repos/hoytech/vmtouch/statuses/60fa5b0931b2bfffd7503c2b74704b446f3d523f"}},"author_association":"NONE","auto_merge":null,"active_lock_reason":null}},"public":true,"created_at":"2024-03-02T21:46:14Z"},{"id":"36190041588","type":"PushEvent","actor":{"id":1299177,"login":"ghuls","display_login":"ghuls","gravatar_id":"","url":"https://api.github.com/users/ghuls","avatar_url":"https://avatars.githubusercontent.com/u/1299177?"},"repo":{"id":488579310,"name":"ghuls/vmtouch","url":"https://api.github.com/repos/ghuls/vmtouch"},"payload":{"repository_id":488579310,"push_id":17359269321,"size":1,"distinct_size":1,"ref":"refs/heads/mlock2","head":"60fa5b0931b2bfffd7503c2b74704b446f3d523f","before":"ec3336173d18248f38c6f30f9e78a0316ebb3274","commits":[{"sha":"60fa5b0931b2bfffd7503c2b74704b446f3d523f","author":{"email":"gert.hulselmans@kuleuven.be","name":"Gert Hulselmans"},"message":"Use mlock2() instead of mlock() on Linux and use mlockall(MCL_FUTURE).\n\nUse mlock2() 
instead of mlock() on Linux:\n - mlock2() allows a third argument MLOCK_ONFAULT, which allow to\n lock the pages in a range before touching them, so mlock2()\n can be called before touching all pages instead of after.\n - \"vmtouch -t -l\" some_file would not keep the pages in cache\n after touching them, so at the lock() step they needed to\n be read again from disk. After this patch files are cached\n twice as fast on this system.\n\nUse mlockall(MCL_FUTURE) instead of mlockall(MCL_CURRENT):\n - Set mlockall(MCL_FUTURE) before touching pages to have similar\n behaviour as the mlock2() approach above.","distinct":true,"url":"https://api.github.com/repos/ghuls/vmtouch/commits/60fa5b0931b2bfffd7503c2b74704b446f3d523f"}]},"public":true,"created_at":"2024-03-02T21:44:41Z"}]