The Pipes and Plumbing
Data Engineers (DEs) are responsible for the pipes and plumbing that enable BIEs to work their magic.
Data that hasn’t been properly organized and cleaned can be dangerous: it is hard to analyze, and the conclusions drawn from it could be incorrect. Data Engineering is the practice of collecting, cleaning, and storing data and making it available to end users. Further transformations can be completed using Extract, Transform, and Load (ETL) methods. BIEs and Business Analysts rely heavily on Data Engineers to construct a data platform that is easy to use, accurate, and scalable.
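To make the ETL idea concrete, here is a minimal sketch using only the Python standard library. The sample data, the `spend` table, and the function names are all hypothetical, chosen for illustration; a real pipeline would read from actual source systems and load into a proper warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export: inconsistent casing, stray spaces, a missing amount.
RAW = """region,amount
 east ,100
WEST,250
east,
west,50
"""

def extract(text):
    """Extract: parse the raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: normalize region names and drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records rather than guess a value
        cleaned.append((row["region"].strip().lower(), float(amount)))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into a queryable table."""
    conn.execute("CREATE TABLE IF NOT EXISTS spend (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO spend VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database stands in for a warehouse
load(transform(extract(RAW)), conn)
totals = dict(conn.execute("SELECT region, SUM(amount) FROM spend GROUP BY region"))
```

Once the data is loaded, end users can query one clean table instead of repairing the raw export themselves.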
Tony started working with the Central FP&A team of a major tech company. At the time, the entire group was doing everything in Excel and emailing PDF reports to senior leaders. This was time-consuming, error-prone, and a poor customer experience.
Tony set up an AWS account and plugged into existing data infrastructure through AWS Lake Formation and S3. He applied further transformations to the data with Python scripts and SQL using AWS Glue. This data lake architecture laid the foundation for many dashboards that served hundreds of users across the organization. See more detail on the T&E Dashboard or Click Through Financials Web Application.
One of the world's largest ETF providers lacked the ability to readily access data across the $7T ETF landscape. Users had to go into Bloomberg Terminals and download data into Excel. This was especially time-intensive when Senior Management asked for a detailed analysis of competitor funds. Furthermore, the data coming from Bloomberg often contained errors that went unnoticed.
Tony built data pipelines that programmatically brought in data from Bloomberg and multiple other financial data providers. He combined the data, reconciled differences to identify errors, and reported issues back to the data providers. He then organized the data and created multiple lightweight files that could be quickly ingested into an ETF Dashboard, making the output appear real-time to the end user.
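The reconciliation step can be sketched roughly as follows. This is an illustrative example and not the actual pipeline: the tickers, values, 1% tolerance, and the `reconcile` function are all assumptions, and real provider feeds would be far larger and arrive through their own APIs.

```python
from math import isclose

def reconcile(primary, secondary, rel_tol=0.01):
    """Compare two provider feeds keyed by ticker.

    Returns (clean, issues): values both providers agree on within
    rel_tol, and a list of discrepancies to report back to the providers.
    """
    clean, issues = {}, []
    for ticker in sorted(set(primary) | set(secondary)):
        a, b = primary.get(ticker), secondary.get(ticker)
        if a is None or b is None:
            issues.append((ticker, "missing from one provider", a, b))
        elif isclose(a, b, rel_tol=rel_tol):
            clean[ticker] = a
        else:
            issues.append((ticker, "values disagree", a, b))
    return clean, issues

# Made-up numbers for illustration only (e.g. fund assets in $B).
provider_a = {"AAA": 100.0, "BBB": 50.0, "CCC": 25.0}
provider_b = {"AAA": 100.2, "BBB": 5.0, "DDD": 10.0}

clean, issues = reconcile(provider_a, provider_b)
for ticker, reason, a, b in issues:
    print(f"{ticker}: {reason} (A={a}, B={b})")
```

Only the agreed-upon values flow on to the dashboard files, while the `issues` list becomes the error report for the providers. The tolerance is a design choice: too tight and rounding differences flood the report, too loose and real errors slip through.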