Download dataset

Current version: v1.0 (released in XX month XX year)

Terms of use: Use of this dataset is subject to the Reddit API terms.

Click here to download v1.0 (XX GB)

The downloaded file contains the following:

- README.txt
- TERMSOFUSE.txt
- metadata/
  ├── subreddit 1/
  │   ├── <reddit.post.id 1>.json
  │   ├── <reddit.post.id 2>.json
  │   └── ...
  ├── subreddit 2/
  └── ...
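
The layout above can be traversed with a few lines of Python. The following is a minimal sketch, assuming the download has been extracted to a local directory; the helper name iter_posts is hypothetical and is not shipped with the dataset.

```python
import json
from pathlib import Path


def iter_posts(metadata_dir):
    """Yield (subreddit, post_id, record) for every JSON file under metadata_dir.

    Assumes the layout metadata/<subreddit>/<reddit.post.id>.json.
    iter_posts is a hypothetical helper, not part of the dataset.
    """
    root = Path(metadata_dir)
    # Each immediate subdirectory of metadata/ is treated as one subreddit.
    for subreddit_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for json_path in sorted(subreddit_dir.glob("*.json")):
            with json_path.open(encoding="utf-8") as fh:
                record = json.load(fh)
            # The file stem (name without ".json") is taken as the post ID.
            yield subreddit_dir.name, json_path.stem, record


# Example use, assuming the archive was extracted into the current directory:
# for subreddit, post_id, record in iter_posts("metadata"):
#     print(subreddit, post_id, record["title"])
```

Yielding records one at a time avoids loading the whole dataset into memory at once.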

Metadata format

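Each <reddit.post.id>.json file contains the metadata for one image post, stored in the folder of the subreddit it was posted to. It holds the post's title and image URL, plus a list of comments, each with an analysis block, following the schema illustrated by the sample below.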

A sample from the dataset:

{
    "title": "elon musk at my local barnes and noble",
    "url": "https://i.redd.it/a7oueb592bpb1.jpg",
    "comments": [
        {
            "author": "moxyfloxacin",
            "body": "i hope its pages and pages of the truth i doubt it though",
            "created_utc": "2023-09-19 20:46:15",
            "analysis": {
                "common nouns": {
                    "hope": 1,
                    "page": 2,
                    "doubt": 1,
                    "truth": 1
                },
                "emotion tag": "optimism",
                "attention scores": {
                    "hope": 1.0,
                    "doubt": 0.11211709678173065
                }
            }
        },
        {
            "author": "DataAstronaut_",
            "body": "just listened to walter isacsons podcast with lex friedman from the sound of him talking about elon on it should be an interesting read",
            "created_utc": "2023-09-19 21:22:38",
            "analysis": {
                "common nouns": {
                    "walter": 1,
                    "isacsons": 1,
                    "friedman": 1,
                    "elon": 1,
                    "podcast": 1,
                    "walter isacsons": 1,
                    "interesting read": 1,
                    "walter isacsons podcast": 1,
                    "isacsons podcast": 1
                },
                "verbs": {
                    "listen": 1,
                    "sound": 1,
                    "talk": 1
                },
                "adjectives": {
                    "interesting": 1,
                    "lex friedman": 1
                },
                "emotion tag": "approval",
                "attention scores": {
                    "listened": 1.0,
                    "read": 0.2401873618364334,
                    "interesting": 0.10860899090766907
                }
            }
        },
        ...
    ]
}
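
To show how the fields in this sample might be consumed, here is a minimal Python sketch that tallies emotion tags and common nouns across the comments of one post. The function name summarize_post is hypothetical. Because the first comment in the sample has no "verbs" or "adjectives" entries, the sketch assumes that any analysis field may be absent and reads each one defensively.

```python
from collections import Counter


def summarize_post(record):
    """Aggregate per-comment analysis fields for one post record.

    Returns (emotion_counts, noun_counts) as Counter objects.
    summarize_post is a hypothetical helper; the field names
    ("comments", "analysis", "emotion tag", "common nouns") are
    taken from the sample record.
    """
    emotion_counts = Counter()
    noun_counts = Counter()
    for comment in record.get("comments", []):
        # Assumption: any analysis field may be missing, so use .get().
        analysis = comment.get("analysis", {})
        tag = analysis.get("emotion tag")
        if tag is not None:
            emotion_counts[tag] += 1
        # Counter.update adds the per-comment noun counts to the totals.
        noun_counts.update(analysis.get("common nouns", {}))
    return emotion_counts, noun_counts
```

The record argument is the dictionary obtained by parsing a single <reddit.post.id>.json file, for example with json.load.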

Maintained by: Karen Ye