The file name itself follows standard Linux archiving conventions:
: Security experts, including Binance CEO Changpeng Zhao, suggested that the leak stemmed from a misconfigured Elasticsearch database left exposed on the internet without a password.
: Journalists from The New York Times and The Wall Street Journal contacted individuals listed in the sample and confirmed that the details, including names, addresses, and police records, were accurate.
The sample provided a snapshot of the sensitive information held by the Shanghai National Police. According to the original Breach Forums post, the broader database included:
: Detailed case reports and criminal records, ranging from minor traffic violations to major criminal investigations.
By February 2025, researchers at SpyCloud reported that re-circulated copies of this dataset were still being traded in the underground, with modern iterations containing nearly 960 million rows of data.
: Records included individuals from across China, not just Shanghai, covering roughly 7.4% of China's total population.
The file, originally uploaded by a user of the now-defunct "Breach Forums", served as a proof of concept to verify the authenticity of a massive 23-terabyte dataset allegedly containing the personal information of 1 billion Chinese citizens.
.tar.gz: A compressed archive format commonly used for large data transfers.
: Full names, national ID numbers (resident identity cards), mobile phone numbers, birthplaces, and birthdates.
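
As a side note on the file-naming convention mentioned above, a `.tar.gz` suffix usually means two layers: files are first bundled into a tar archive, and that archive is then gzip-compressed. The following is a minimal sketch of that layering using Python's standard library. The helper names (`make_tar_gz`, `list_members`) and the example file names are hypothetical and built entirely in memory; nothing here is taken from the leaked file.

```python
import gzip
import io
import tarfile


def make_tar_gz(files: dict[str, bytes]) -> bytes:
    """Bundle in-memory files into a tar archive, then gzip it."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # tar headers store each member's size
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_members(archive: bytes) -> list[str]:
    """Return member names without extracting anything to disk."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        return tar.getnames()


# Hypothetical example content, used only to illustrate the format.
archive = make_tar_gz({"example/a.txt": b"hello\n", "example/b.txt": b"world\n"})

# Outer layer: gzip streams start with the magic bytes 0x1f 0x8b.
is_gzip = archive[:2] == b"\x1f\x8b"

# Inner layer: after decompression, a tar header carries "ustar" at offset 257.
inner_tar = gzip.decompress(archive)
is_tar = inner_tar[257:262] == b"ustar"

names = list_members(archive)
```

Listing member names before extracting is a reasonable habit with any archive of unknown origin, since it avoids writing untrusted paths to disk.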