How does Canopy Extract work?


EXTRACT TABLES FROM PDFS


Last Updated: January 24, 2023

All the data we need is in table format

Our data is invariably in table format. Typically, we need to extract the following three tables from each PDF document:

Holdings
Transactions
Current Account Credits and Debits

Canopy Extract is designed to extract any table (not just the three tables above) from any PDF document. If you need to extract charts or images from a PDF document, Canopy Extract is not the right tool.

Extract needs the PDF document and an Excel Configuration file

To work, Canopy Extract needs two files:

PDF document to be extracted (e-PDF is preferred, but paper scans will also work)
Excel Configuration File (which describes the table to be extracted)
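The pairing of the two input files can be sketched as a small data structure. This is a minimal illustration only: the `ExtractJob` class, its field names, and the file-extension checks are hypothetical and are not part of Canopy Extract itself.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass(frozen=True)
class ExtractJob:
    """Hypothetical container for the two inputs an extraction run needs."""

    pdf_path: Path      # the PDF document to be extracted
    config_path: Path   # the Excel Configuration File describing the table

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means the pair looks usable."""
        issues = []
        if self.pdf_path.suffix.lower() != ".pdf":
            issues.append(f"{self.pdf_path.name}: expected a .pdf document")
        # Assumption: the configuration file is a standard Excel workbook.
        if self.config_path.suffix.lower() not in {".xlsx", ".xls"}:
            issues.append(f"{self.config_path.name}: expected an Excel configuration file")
        return issues


job = ExtractJob(Path("statement.pdf"), Path("holdings_config.xlsx"))
print(job.problems())
```

The check looks only at file names, so it cannot tell an e-PDF from a paper scan; it simply makes explicit that one PDF and one configuration file travel together.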


What does a typical PDF document look like?

Multi-level headers and nested rows are the key difficulties when extracting data from a PDF table.
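To illustrate why stacked headers are awkward, here is a minimal sketch of one common way to collapse several header rows into a single label per column. It is not Canopy Extract's method. The function name `flatten_headers`, the separator, and the sample column names are invented for illustration. It assumes that a blank cell in an upper header row belongs to the nearest non-blank cell to its left, which will not hold for every statement layout.

```python
from typing import List, Optional


def flatten_headers(header_rows: List[List[Optional[str]]], sep: str = " / ") -> List[str]:
    """Collapse stacked header rows into one label per column.

    Assumption: in every row except the bottom one, a blank cell is the
    continuation of a spanning header that starts to its left.
    """
    if not header_rows:
        return []
    width = max(len(row) for row in header_rows)
    filled = []
    for depth, row in enumerate(header_rows):
        padded = list(row) + [None] * (width - len(row))
        is_bottom = depth == len(header_rows) - 1
        out = []
        last = ""
        for cell in padded:
            text = (cell or "").strip()
            if text:
                last = text
            elif not is_bottom:
                text = last  # carry the spanning header to the right
            out.append(text)
        filled.append(out)
    # Join the non-empty parts of each column from top to bottom.
    return [sep.join(row[col] for row in filled if row[col]) for col in range(width)]


# Invented two-level header, loosely shaped like a holdings table.
rows = [
    ["Security", "Market Value", None, "Cost", None],
    [None, "Local", "USD", "Local", "USD"],
]
print(flatten_headers(rows))
```

In this sample, "Market Value" and "Cost" each span two sub-columns, so the flattened labels become "Market Value / Local", "Market Value / USD", and so on. Nested rows (for example, sub-totals inside an asset class) need separate handling that this sketch does not attempt.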

What does an Excel Configuration file look like?

The Excel Configuration file for the table above is given below. Further details are on the page Parts of a Config File.

Excel Configuration file to extract the Holdings table in the image above

