"Too Many Requests" error when querying NCBI


1

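When querying the NCBI SRA database, I consistently get a "Too Many Requests" error, even though I am running fewer than 10 requests per second and I have an API key, which is supposed to allow up to 10 requests per second.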

This is my (Python) code:

import subprocess
import concurrent.futures
import time

bioprojects = [
    "PRJNA644722",
    "PRJNA644892"
]

all_metadata_generators = []

def fetch_metadata(bioproject):
    return subprocess.run(f"esearch -db sra -query '{bioproject}[bioproject]' | efetch -format runinfo", shell=True, capture_output=True, text=True)

with concurrent.futures.ThreadPoolExecutor() as executor:
    for i in range(0, len(bioprojects), 10):
        all_metadata_generators.append(executor.map(fetch_metadata, bioprojects[i:i+10]))
        time.sleep(1)


for metadata_generator in all_metadata_generators:
    for metadata in metadata_generator:
        print(metadata)

1

The execution of your EDirect commands may be limited unless you are using an API Key, which you can create from your MyNCBI account. Your API Key can be used in one of several different ways.

In an argument to an EDirect command:

esearch -db nuccore -query 'some query' -api_key 12345

In a UNIX environment:

Either run the following line just before your EDirect commands, or add it to your .bashrc file:

export NCBI_API_KEY=12345

In the environment, from a Perl script:

$ENV{NCBI_API_KEY} = "12345";
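Since the question launches EDirect from Python, the same environment-variable approach could be applied there as well. This is only a rough sketch: the helper names `build_env`, `build_command`, and `fetch_runinfo` are made up for illustration, and `12345` is a placeholder key, as in the examples above.

```python
import os
import shlex
import subprocess


def build_env(api_key, base_env=None):
    """Return a copy of the environment with NCBI_API_KEY set."""
    env = dict(os.environ if base_env is None else base_env)
    env["NCBI_API_KEY"] = api_key
    return env


def build_command(bioproject):
    """Build the esearch | efetch pipeline used in the question."""
    # shlex.quote protects the query from shell interpretation.
    query = shlex.quote(f"{bioproject}[bioproject]")
    return f"esearch -db sra -query {query} | efetch -format runinfo"


def fetch_runinfo(bioproject, api_key):
    """Run the pipeline with the API key exported to the child processes."""
    return subprocess.run(
        build_command(bioproject),
        shell=True,
        capture_output=True,
        text=True,
        env=build_env(api_key),
    )
```

Calling `fetch_runinfo("PRJNA644722", "12345")` would then run the pipeline with the key visible to both `esearch` and `efetch`, assuming EDirect is installed and on the PATH.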

0

As @vkkodali pointed out, the problem turned out to be the multithreading: these requests should not be run concurrently.
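For reference, a single-threaded version of the script from the question might look roughly like this. It is a sketch, not a tested fix: `fetch_all_sequentially` is a name I made up, and the `delay` of 0.5 seconds between BioProjects is an arbitrary assumption rather than a documented value.

```python
import subprocess
import time


def fetch_metadata(bioproject):
    """Same esearch | efetch pipeline as in the question."""
    return subprocess.run(
        f"esearch -db sra -query '{bioproject}[bioproject]' | efetch -format runinfo",
        shell=True,
        capture_output=True,
        text=True,
    )


def fetch_all_sequentially(bioprojects, fetch=fetch_metadata, delay=0.5):
    """Fetch one BioProject at a time, pausing between calls.

    `delay` (seconds) is an assumed value; adjust it as needed.
    """
    results = []
    for index, bioproject in enumerate(bioprojects):
        if index > 0:
            time.sleep(delay)  # no pause needed before the first request
        results.append(fetch(bioproject))
    return results


if __name__ == "__main__":
    for metadata in fetch_all_sequentially(["PRJNA644722", "PRJNA644892"]):
        print(metadata.stdout)
```

The `fetch` parameter is there only so the loop can be exercised without hitting NCBI, for example by passing a stub function.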