# Character Archive Torrent Part 2

*The Character Archive is operated by Cyberes - chub-archive@evulid.cc - [@cyberes:evulid.cc](https://matrix.to/#/@cyberes:evulid.cc) - [char-archive.evulid.cc](https://char-archive.evulid.cc)*

**Snapshot Date:** January 5, 2025

**Snapshot Size:** 107G

This is the second torrent of the character card archive from `char-archive.evulid.cc`. More info is available in the part 1 torrent.

Takeout torrent part 1 is required. To create the complete archive, merge both `hashed-data` directories and import the latest SQL dump. Both `files` directories should be merged as well.

#### Torrent File Download

[char-archive_part-1.torrent](https://chub-archive.evulid.cc/api/file/download?path=/takeout/char-archive_part-1.torrent&download=true)

[char-archive_part-2.torrent](https://chub-archive.evulid.cc/api/file/download?path=/takeout/char-archive_part-2.torrent&download=true)

## Data

New data sources.

### RisuAI RisuRealm

Character cards from `realm.risuai.net` are scraped daily. User info is not scraped. Cards are in the V3 format.

RisuRealm has an awful API that serves incorrect data. The scraper tries its best to parse it, but an unknown number of cards fail and are skipped.

### Webring

Archive of the Neocities [AI chatbot webring](https://chatbots.neocities.org/). Site owners can opt out via the `` HTML tag.

### Alternative Bot Websites

Various bot-posting websites.

`nyai.me`: Bot catalog with chan-like features. Associated with 4chan.

### Generic Character Cards

The scrapers crawl the web searching for character cards. One primary source is files stored on catbox.moe.

If a chub.ai author is found in a card's metadata, it is added to the chub.ai archive.

### Imported Data Sources

**Janitor AI**: 69k cards scraped from Janitor AI before they made card definitions private. This is an incomplete archive. A larger archive exists on [Hugging Face](https://huggingface.co/datasets/AUTOMATIC/jaicards) that contains 190k cards.
This larger archive will not be integrated into the Character Archive because JanitorAI cards are consistently low-quality, but it is included in our takeout torrents under the `/third-party` folder.

**Pygmalion Discord Server**: this has not been imported into the archive and only exists in the [file browser](https://char-archive.evulid.cc/#/files.html?path=/third-party/Pygmalion+Discord+Server+04-18-2023). Contains cards from the Pygmalion Discord server up to 04-18-2023.

**Roko's Basilisk**: scrape of Roko's Basilisk, an early but influential frontend for chatbots which shut down after a week over concerns regarding OpenAI's terms of service. Contains the definitions of many CAI bots that remain private on character.ai. Predates chub.ai and SillyTavern. Authors, where found, have been imported.

**VenusAI**: VenusAI up to 05-27-2023, scraped by Koreans.

**VenusAI Official Discord Server**: cards from the official VenusAI Discord. This archive was created on 05-28-2023 and originally distributed as `ai_characters_archive.zip`.

## Contents

`char_archive_database_01-5-2025.sql.7z`: the PostgreSQL database dump.

`hashed-data_01-09-2025.7z`: the hashed data.

`proxy_stats_index_01-08-2025.7z`: dump of the proxy stats Elasticsearch index using [elasticsearch-dump](https://github.com/elasticsearch-dump/elasticsearch-dump).
Organization of `files/`:

```
files/
├── [4.0K]  historical
│   └── [4.0K]  jaicards
│       ├── [ 10G]  cards.7z.001
│       ├── [ 10G]  cards.7z.002
│       ├── [ 10G]  cards.7z.003
│       ├── [ 10G]  cards.7z.004
│       ├── [ 10G]  cards.7z.005
│       ├── [ 10G]  cards.7z.006
│       ├── [1.5G]  cards.7z.007
│       ├── [156M]  html.7z
│       ├── [ 14K]  index.html
│       ├── [188M]  original json.7z
│       └── [2.0K]  README.md
├── [4.0K]  logs
│   └── [4.0K]  Moxxie JanitorAI
│       ├── [ 17M]  Moxxie JanitorAI Logs 10.7z
│       ├── [ 18M]  Moxxie JanitorAI Logs 11.7z
│       ├── [ 11M]  Moxxie JanitorAI Logs 1.7z
│       ├── [2.3M]  Moxxie JanitorAI Logs 2.7z
│       ├── [ 13M]  Moxxie JanitorAI Logs 3.7z
│       ├── [ 14M]  Moxxie JanitorAI Logs 4.7z
│       ├── [ 15M]  Moxxie JanitorAI Logs 5.7z
│       ├── [ 15M]  Moxxie JanitorAI Logs 6.7z
│       ├── [ 17M]  Moxxie JanitorAI Logs 7.7z
│       ├── [ 17M]  Moxxie JanitorAI Logs 8.7z
│       ├── [ 17M]  Moxxie JanitorAI Logs 9.7z
│       ├── [ 16M]  Moxxie JanitorAI Mirror 1 Logs 1.7z
│       └── [3.2K]  README.md
└── [4.0K]  other
    ├── [ 36K]  aicg cycle.png
    ├── [4.9M]  A Single Cloud Compromise Can Feed an Army of AI Sex Bots - Krebs on Security.pdf
    ├── [ 47K]  ballcourt.png
    ├── [ 824]  crack-prompt.txt
    ├── [ 73K]  dark role-playing services.png
    ├── [1.0M]  Detecting AI resource-hijacking with Composite Alerts - Lacework.pdf
    ├── [2.6M]  How do people use ChatGPT We analyzed real AI chatbot conversations - The Washington Post.pdf
    ├── [289K]  Ian Ahl.png
    ├── [ 29K]  jester777.md
    ├── [2.1M]  Meta and OpenAI have spawned a wave of AI sex companions and some of them are children - Fortune.pdf
    ├── [262K]  Sourcegraph aicg Hack.pdf
    ├── [282K]  The Growing Dangers of LLMjacking - Sysdig.pdf
    └── [3.4M]  When AI Gets Hijacked Exploiting Hosted Models for Dark Roleplaying - Permiso.pdf
```
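
The merge step described at the top of this README (combining the `hashed-data` directories and the `files` directories from part 1 and part 2) could be scripted roughly as below. This is only a sketch, not an official tool: the function name `merge_tree` is made up for illustration, and it assumes that a file already present at the same relative path in the destination can be kept as-is. Any such collisions are returned so they can be checked by hand.

```python
import shutil
from pathlib import Path


def merge_tree(src: Path, dst: Path) -> list[Path]:
    """Copy every file under src into dst, keeping relative paths.

    Files that already exist in dst are NOT overwritten (an assumption,
    see the note above); their destination paths are returned instead.
    """
    skipped = []
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        target = dst / path.relative_to(src)
        if target.exists():
            # Leave the existing copy alone and report the collision.
            skipped.append(target)
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # copy2 also preserves timestamps
    return skipped
```

For example, `merge_tree(Path("part-2/hashed-data"), Path("part-1/hashed-data"))` followed by the same call for the two `files` directories, where `part-1` and `part-2` are placeholder paths for wherever each torrent was extracted.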