Do you need to uncover patterns in corporate behaviour, keep tabs on an organisation of interest, or gain access to free and accurate financial data for open source research? If so, you should consider using Electronic Data Gathering, Analysis, and Retrieval, or EDGAR.
This database of corporate and financial data, maintained by the United States Securities and Exchange Commission (SEC) and available for free, contains millions of filings by public companies going back as far as 1994. And yet it remains underused by journalists, researchers, and financial analysts.
One reason it’s underused is that many potential users are simply unaware of EDGAR and end up subscribing to multiple corporate data providers. Another is that they’re all too familiar with EDGAR, specifically the frustrating shortcomings of its interface:
- Users cannot download at once all the documents returned by a text search of the database
- Owing to shifting data tags, it’s extremely cumbersome to create financial profiles for single companies or to compare their respective financial metrics
- There is no feature for subscribing to a single RSS feed covering multiple companies of interest
To address these shortcomings, we have developed a set of tools that we hope will encourage more people to use EDGAR for corporate and financial research. These tools allow users to programmatically save the results of search terms in EDGAR, create a financial profile of every company traded on a US exchange, and filter EDGAR’s broadest RSS feed by companies of interest.
Our goal was to make these resources accessible to as broad an audience as possible and so, with the exception of our financial data tables, only basic knowledge of Python is required. You can access our tools here.
Search for Terms in the EDGAR Database and Programmatically Save the Results
The first of our tools makes it simple to record search results from EDGAR, surfacing data for free that others pay exorbitant amounts of money to get from third-party vendors.
For example, a recurring theme in the Wall Street Journal is the declining importance of Environmental, Social, Governance (ESG) principles, as represented by the number of quarterly earnings calls in which company leadership addresses ESG issues. ESG encompasses investment strategies that touch on corporate social responsibility, and can include everything from investments in carbon offsets and green technology to strategies that avoid investing in supply chains with subpar labour standards. In June and September articles on the topic, the WSJ attributed data showing the decline in mentions of ESG by executives to AlphaSense and FactSet respectively. Subscriptions to both services cost several thousand dollars per year.
Fortunately, we can use the text search tool to show a similar ESG trend for free by searching for the term “ESG” within annual and quarterly reports filed with EDGAR:
With this data, we can also highlight the companies that most frequently mention ESG:
Using the EDGAR database to show broad trends across hundreds of thousands of filings is possible because our tool automatically breaks up the search into manageable chunks, crawls through each page of results, and appends the data to a .csv file, which can be further analysed in Excel or, as in this instance, with the Plotly library in Python.
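The chunk-and-crawl approach can be sketched roughly as follows. This is a minimal illustration and not the tool’s actual code: the 30-day chunk size, the column names, and the `fetch_page` callback are all assumptions, and `fake_fetch_page` is a hypothetical stand-in for a real request to EDGAR’s search so that the example runs offline.

```python
import csv
import io
from datetime import date, timedelta


def date_chunks(start, end, days=30):
    """Yield (chunk_start, chunk_end) pairs covering start..end inclusive."""
    cursor = start
    while cursor <= end:
        chunk_end = min(cursor + timedelta(days=days - 1), end)
        yield cursor, chunk_end
        cursor = chunk_end + timedelta(days=1)


def collect_results(term, start, end, fetch_page, out, days=30):
    """Walk every date chunk and every results page, appending rows to a CSV stream."""
    # Column names are illustrative, not the tool's real output schema.
    writer = csv.DictWriter(out, fieldnames=["date", "name", "cik", "filing_url"])
    writer.writeheader()
    total = 0
    for chunk_start, chunk_end in date_chunks(start, end, days):
        page = 1
        while True:
            rows = fetch_page(term, chunk_start, chunk_end, page)
            if not rows:  # an empty page means this chunk is exhausted
                break
            writer.writerows(rows)
            total += len(rows)
            page += 1
    return total


def fake_fetch_page(term, chunk_start, chunk_end, page):
    """Hypothetical stand-in for a real request: one row per chunk, page 1 only."""
    if page > 1:
        return []
    return [{
        "date": chunk_start.isoformat(),
        "name": "Example Corp",
        "cik": "0000000001",
        "filing_url": "https://example.com/filing",
    }]


buffer = io.StringIO()
count = collect_results("ESG", date(2023, 1, 1), date(2023, 3, 31),
                        fake_fetch_page, buffer)
print(count, "rows written")
```

Splitting by date keeps each query small enough to page through completely. In practice `fetch_page` would wrap an HTTP call and should respect the SEC’s fair-access guidelines.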
Each result row in the returned .csv table includes the date, the jurisdiction where the legal entity is registered, its principal place of business, its name, its unique identifying number or Central Index Key (CIK), a URL for the filing index (which may include additional documents and information), and a link to the filing itself. As demonstrated with the ESG data, it sometimes isn’t even necessary to open the filings themselves to discover interesting patterns; the table data is sufficient.
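To illustrate how far the table alone can go, here is a small sketch that counts filings per year and per company from such a .csv. The sample rows and the column names (`date`, `name`, `cik`) are made up for the example and may not match the tool’s real output.

```python
import csv
import io
from collections import Counter

# Invented sample data standing in for the tool's .csv output.
SAMPLE_CSV = """date,name,cik
2021-03-01,Example Corp,0000000001
2021-08-15,Example Corp,0000000001
2022-02-10,Sample Holdings,0000000002
2022-05-20,Example Corp,0000000001
"""


def summarise(csv_text):
    """Count matching filings by year and by company name."""
    by_year = Counter()
    by_company = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_year[row["date"][:4]] += 1  # first four characters of an ISO date
        by_company[row["name"]] += 1
    return by_year, by_company


by_year, by_company = summarise(SAMPLE_CSV)
print(sorted(by_year.items()))     # filings per year
print(by_company.most_common(1))   # most frequent filer
```

The same counters could be passed to a plotting library to draw the trend and ranking charts.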
Obtain a Full Financial Profile on Any Company Traded on a US Exchange
Our second tool lets users create a unique profile of any company that is traded on a US exchange. On the New York Stock Exchange and the Nasdaq Stock Market alone, there are over 2300 and 3600 listed companies, respectively.
Every company whose shares are available to the public must periodically report its financials to the SEC. This financial data is included within the text and .htm versions of the filings and is also stored in XBRL format, which uses a system of data tags, or taxonomies, to ensure that data points are consistent across time and across different companies.
For example, when a company tracks the number of outstanding shares it had during a reporting period, the text and tables of its report might use the terms “common shares,” “outstanding shares,” or “basic shares.” However, within the XBRL document, a company’s number of outstanding shares might be associated with any of around a dozen tags chosen by its accountants.
These tags may change from year to year and from company to company, which makes EDGAR’s financial datasets of little use without processing. Through the study of taxonomy documents published by the XBRL foundation, a crash course in accounting, and the use of semantic and numeric matching, we were able to obtain a general schema that matches the varying tags to a plain English term for around 100 commonly used financial data points.
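The core idea of mapping many tags to one plain English term can be sketched as a simple lookup. This is a deliberately reduced illustration: it uses an ordered list of candidate tags rather than the semantic and numeric matching described above, and the alias lists are short examples chosen for this sketch, not the schema built for the tool.

```python
# Illustrative alias lists: each plain English term maps to candidate XBRL
# tags in order of preference. A real schema would be far longer.
TAG_ALIASES = {
    "shares_outstanding": [
        "EntityCommonStockSharesOutstanding",
        "CommonStockSharesOutstanding",
        "WeightedAverageNumberOfSharesOutstandingBasic",
    ],
    "net_income": ["NetIncomeLoss", "ProfitLoss"],
}


def normalise(facts, aliases=TAG_ALIASES):
    """Return {plain_term: value}, using the first candidate tag present in facts."""
    profile = {}
    for term, candidates in aliases.items():
        for tag in candidates:
            if tag in facts:
                profile[term] = facts[tag]
                break  # stop at the highest-priority tag that was reported
    return profile


# Two hypothetical filings from the same company that use different tags.
filing_2020 = {"CommonStockSharesOutstanding": 1000, "NetIncomeLoss": 50}
filing_2021 = {"EntityCommonStockSharesOutstanding": 1100, "ProfitLoss": 40}

print(normalise(filing_2020))
print(normalise(filing_2021))
```

Both filings come out with the same keys, which is what makes a year-over-year or company-to-company comparison possible. Treating such tags as interchangeable is itself an accounting judgment, so a real mapping needs more care than a first-match rule.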
Using this term-matching library as a reference, we created a single table of financial data covering all the companies that report to the SEC. We are continuously working to improve the table in several notable areas, including:
- Parsing data for non-US companies
- Handling missing data and typos in EDGAR data
- Correcting lapses in the accuracy of the current parsing strategy
In its current form, the data set allows us to generate coherent and accurate time series for most companies, and for many years. Here we see the declining net income (profit) of financial services firm Morningstar, which, as we saw above, is one of the companies most associated with ESG investing:
RSS Feed Customisation
Our final tool lets users surface valuable information from EDGAR’s RSS feeds, which currently make it hard to track up-to-date information on individual companies.
EDGAR publishes several RSS feeds that provide a daily overview of recent filings. While it’s possible to subscribe to individual company feeds, following multiple companies through multiple subscriptions is burdensome.
We find it more practical to access one of the main RSS feeds and then filter the results by companies of interest.
To this end, we’ve built a simple tool that allows the user to enter a list of stocks of interest and then send a request to EDGAR’s broadest RSS feed, which includes both XBRL and non-XBRL filings. The tool returns a CSV file in a format similar to that of the text search tool, with entity date, entity name, ticker, filing type, CIK number, and links to the filing and its index.
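The filtering step might look something like the sketch below. It parses a small, made-up Atom feed rather than calling EDGAR, and it assumes that each entry title carries the filing type, company name, and a ten-digit CIK in parentheses; the real feed’s structure should be checked before relying on this. It also filters by CIK directly and leaves out the ticker-to-CIK lookup that a list of stocks would require.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}
CIK_PATTERN = re.compile(r"\((\d{10})\)")

# Invented sample feed; the entry layout is an assumption for illustration.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>10-Q - Example Corp (0000000001) (Filer)</title>
    <link href="https://example.com/filing-1"/>
    <updated>2024-01-02T10:00:00-05:00</updated>
  </entry>
  <entry>
    <title>8-K - Other Inc (0000000002) (Filer)</title>
    <link href="https://example.com/filing-2"/>
    <updated>2024-01-02T11:00:00-05:00</updated>
  </entry>
</feed>"""


def filter_feed(feed_xml, ciks_of_interest):
    """Return one dict per feed entry whose CIK is in ciks_of_interest."""
    root = ET.fromstring(feed_xml)
    rows = []
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        match = CIK_PATTERN.search(title)
        if not match or match.group(1) not in ciks_of_interest:
            continue
        link = entry.find("atom:link", ATOM)
        rows.append({
            "updated": entry.findtext("atom:updated", default="", namespaces=ATOM),
            "title": title,
            "cik": match.group(1),
            "link": link.get("href", "") if link is not None else "",
        })
    return rows


rows = filter_feed(SAMPLE_FEED, {"0000000001"})

# Write the filtered entries to CSV (an in-memory buffer here).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["updated", "title", "cik", "link"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Passing a larger set of CIKs widens the watchlist without adding any subscriptions, since only one feed is ever requested.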
As with the text search tool, the RSS tool is useful for research queries with either a narrow or a wide focus. For example, we can keep tabs on a group of companies of interest or, alternatively, use the unfiltered daily results to keep a finger on the pulse of the broader market. One particularly fruitful use case is to feed all the daily filings into an LLM to produce and store concise summaries of the filing texts.
A Final Note
These tools are a work in progress and may need to be adapted to potential changes in the scope and format of EDGAR. Additionally, we hope to adapt and expand upon this toolset in response to your feedback and suggested use cases. Please reach out to either Bellingcat or fellow George Dyer of Market Inference with your comments and questions, and we will gladly assist you.
Our goal is to ensure that the EDGAR database is being used to its fullest extent by the widest possible group of people.