Last Updated: 16 Aug 2022 Views: 442 (After reading, tell us below if this answer was helpful)

How do I join data sets from two financial databases with VLOOKUP in Excel?


You may need to combine, merge or join data sets from multiple financial database sources such as Bloomberg, Capital IQ, Eikon, Datastream or WRDS. Perhaps you want to get credit risk or news sentiment scores from Bloomberg, combined with ESG information from Eikon, then put them together for all companies in the FTSE 350. Joining data sets is a large, complex topic with many potential pitfalls.

You need to have a common identifier between each set, for example, ISIN, LEI, CUSIP or GVKEY. The availability of different identifiers varies between suppliers, regions, company status and time.

There are several tools to consider, including VLOOKUP in Excel, Stata, SAS, Python or R. Although VLOOKUP may be the easiest, the other tools let you write scripts to better trace your steps and support reproducible methods. 

Here are just a few tips which may help you to use VLOOKUP.

 

Don't use tickers

Company tickers are the short codes which resemble the company name in some way, such as 'AAPL' for Apple or 'TSCO' for Tesco. These codes may be very convenient when working within one platform. However, there are several downsides.

  • Tickers are often not the same between different data providers.
  • Tickers can be reused to refer to different companies after they change name or structure.
  • Tickers can be limited to specific stock exchanges, making them useless if the exchange code is missing.

Instead, use a different company identifier code.

 

Get several types of identifier codes with your data

When you are running your initial data retrieval query, make sure to select as many identifier types as possible. You may not know at the time you download your data which other suppliers you will need to use. Each different data supplier may use a different company identifier code.

If you are using Capital IQ (web version), when you screen for companies, you have to explicitly add 'Excel Company ID' as an additional display column. You can later use Capital IQ (Excel version) with this Capital IQ Excel Company ID to retrieve other fields including ISIN.

 

Prepare your tables carefully

In this post, we will describe the two data sets as the 'left table' and the 'right table'. We will take the left table as the starting place, then add additional columns in from the right table. There will be one matching variable/column.

Note, if you want to match on more than one column (eg currency with year), you can combine them into a new column using Excel's concatenate function. Do this in both tables then match on the new column.

The identifier column has to be the first column in the right table. You may need to copy or cut and paste the column. 

It is easier if you have a copy of both tables in the same workbook, though not essential.

 

Naming the ranges or tables

You can use Excel's 'named range' feature to give a convenient name to the right table, rather than a long and clumsy sheet and cell range reference.

Alternatively, you can use the Excel 'create table' feature to set your right table as a true, Excel table. It will get a name of your choice, eg 'Table1'.

 

Writing the VLOOKUP formula

In the left table, create a new column where you want to bring in data from the right table.

Find the column in the left table with the identifier you want to match, eg column $B, where the $ means lock.

Find the number of columns into the right table with the variable you want to bring in, eg 4 columns to the right of the identifier.

Find the named range or table name of the right table, eg Table1.

In the first row after the header, type your VLOOKUP formula, eg row 2.

=VLOOKUP($B2, Table1, 4, FALSE)

where FALSE means you are doing an exact match on the identifier, and the other parameters are described above.

You can copy this formula down a few rows to check it worked.

You can then copy the formula down to the end (use the 'copy down' feature in Excel). 

You can then copy the formula right for another column, updating the column offset from 4 to your new selection.

 

Improving the VLOOKUP formula

This formula should work fine, however, it will not deal with missing data very neatly.

You could copy and paste the column as values (removing the formula), then find-replace the error messages with blank values.

Instead, you could extend the formula with the IFERROR formula, to show blank ("") if there is an error:

=IFERROR(VLOOKUP($B2, Table1, 4, FALSE), "")

 

Still stuck?

Please contact us if you need more help.