Publicly traded companies in the US have to submit annual (10-K) reports to the SEC. These reports contain lots of interesting information, including a detailed description of the company’s business and the risk factors the company faces, so they are an important resource for investment research. The reports are publicly available on the SEC website, and the SEC offers some advanced search functionality (https://www.sec.gov/edgar/search/). However, in my experience the SEC search tool is most helpful when you already know what you are looking for. I wanted to search broadly across companies based on terms that appear in the Item 1. Business and Item 1A. Risk Factors sections of the 10-K annual report. The sample searches on the home page offer interesting examples of how to use the tool. In one sense, you can think of this as “Google Trends” for SEC 10-K annual reports.
Currently my data set includes almost all annual reports from the last 20 years for companies in the S&P 500 or Russell 2000 stock indices, and I plan to expand coverage to other companies as well. The project (https://github.com/kyleleelarson/sec-search) is written in Go, with web scraping and data management handled separately in Python, and uses Elasticsearch as the search backend. I host it using Cloudflare and a GCP instance.
Comments and suggestions are appreciated.