Can I scale beyond Google Sheets to my favorite BI tool?
Connections also allow you to connect a dedicated DataHawk database to your BI tools and other destinations, allowing you access to our rich, raw data. These tools include Power BI, Looker, Tableau, and more. Click here for a complete list of compatible tools.
What kind of data will you be able to access?
All of the data you track on your DataHawk account through the app is also available with Connections, including:
Tracked products and keywords (listings, sales ranks, organic ranks, sponsored ranks on a daily basis)
Seller Central account data (orders, profit & loss, financial events)
Advertising data (aggregated per account, campaign or product)
And of course, the data will come with custom-made KPIs (i.e., sales estimates, keyword search volume)
Set-Up & Quickstart
Setting up is easy and involves only a few steps. Follow the step-by-step guide:
How to navigate the database?
Once connected to your BI tool, you'll discover six different schemas/folders filled with specific tables.
Overview of the accessible schemas
Referential - everything related to your DataHawk account and more!
Product - monitor and benchmark changes in products details and performance
SEO - track, benchmark, and optimize products' organic and sponsored search performance
Finance [requires a Seller Central source] - run easier 3P sales reporting and correlate sales data with other metrics
Advertising [requires an Advertising source] - monitor and analyze your ads performance
Market - [requires a Category source] - monitor daily best seller rank and estimated sales for the top 100 products of a chosen category with a two-year history
Datafeeds - specific data feeds used for google sheets automatized dashboards
Important Data Specifications
Available data in the database depends on what is currently active on your DataHawk workspace: tracked products, tracked keywords & connected accounts.
For ease of understanding and documentation purposes, the datasets here are referred to as "tables" instead of views.
Collection Rate: We collect tracked items daily from Amazon or Walmart.
Refresh Rate: We update your database on an hourly basis.
Often Used Columns: Here is a list of frequently used columns within the tables and their definition/purpose:
workspace_idis your DataHawk workspace identifier.
marketplace_keyis the marketplace identifier (Amazon-US, Amazon-FR, Walmart-US, etc.).
channel_product_idis the product identifier of a platform (ASIN for Amazon, walmart_product_id for Walmart).
product_keyis the unique identifier of a channel_product_id in a specific marketplace. (channel_product_id + marketplace_key ⇒ product_key)
keyword_keyis the unique identifier of a keyword in a specific marketplace. A best practice is to use this table to join keyword-related tables. (keyword + marketplace_key ⇒ keyword_key)
Finance & Advertising:
account_keyis the Amazon Seller Central identifier of a connected account. The same identifier is also used on Amazon Advertising accounts.
Sales ranks & browse nodes:
channel_browse_node_idis the browse node identifier of a platform (amazon_id for Amazon, categoryId for Walmart).
browse_node_idis our internal identifier for a browse node on Amazon or Walmart in a specific marketplace. It is defined by a channel_browse_node_id and a marketplace. (channel_browse_node_id + marketplace_key ⇒ browse_node_id)
All the timestamps columns are in UTC, except when explicitly stated otherwise (i.e., purchase_local_time).
observation_timeis the exact point of time when we observed the item (product, keyword, or rank) on the targeted marketplace. Due to the timestamp details, avoid using this column to join different data sources, as you'll risk losing some information.
observation_dateis the date when we observed the item. It is a column useful for joins between data sources as it is on a date level rather than a timestamp level.
date_dayis the date column we use for aggregated tables. Some of our tables are not based on events or observed web pages. Instead, they are based on an aggregated reports fetched (i.e., Advertising data). We use date_day as a date column for all types of daily aggregations.
date_monthhas the same purpose as date_day but on a monthly basis. It shows the date of the first day of the month, and we use it for monthly aggregated tables .
purchase_time / purchase_date / purchase_local_timeare the date columns used for Seller Central orders. For order data, we have detailed information that goes as deep as the actual timestamp at which the order went through.
event_dateis the date column used for detailed financial events.
sync_timeis a technical column. It displays the timestamp when we ingested the data in our data warehouse. Sometimes data gets reprocessed for quality reasons, due to which this column gets updated. In addition to it being made available for data engineering reasons, it also ensures transparency about the freshness of the data, along with the time it took for it to go from collection to availability.
Referential - Everything related to your DataHawk workspace & more!
This schema contains:
Information relative to your DataHawk workspace: tracked items, tags, Seller Central & Advertising accounts connected, projects.
Repositories about marketplaces, browse node tree, and currency rate. These are useful to add extra information to other data sources, such as browse node details for sales rank data or currency rates to sales data.
This table lists all the Advertising accounts you connected to DataHawk and general information about these accounts. Connecting these accounts will populate everything under Advertising.
List of browse nodes you are currently tracking on the DataHawk's Connections.
A browse node is defined by a name (browse_node_name) and an id (browse_node_id). This will be enabled soon for this version of Connections.
This table is our referential of all browse nodes on Amazon and Walmart and their hierarchies/paths. You can find information such as the path, the node level, the name or the last observation date.
A browse node is defined by its browse_node_id and can have several paths to get there.
This table is a matrix of dates and currency rates.
Since the first of January 2016, you have a currency, a base, and a rate for every date; the base is always USD.
All the data in the Finance and Advertising tables are in the marketplace's currency. If you want it in dollars, you can divide the sales column by the rate.
Here's a sample query that gets the sales in dollars for one of the Finance tables.
select f.*, f.sales/c.rate as sales_in_dollars from "DATAHAWK_SHARE_12345"."REFERENTIAL"."REFERENTIAL_CURRENCY_RATE" as C join "DATAHAWK_SHARE_12345"."FINANCE"."FINANCE_ACCOUNT_METRICS" as F on f.date_day=c.date_day and f.currency=c.currency
On the 27th of February 2022, 1 USD was worth 91.77 AFN.
This table lists all the Seller Central (Finance) accounts you connected to DataHawk. Connecting these accounts will populate everything under FINANCE.
This table lists all the keywords tracked and the corresponding tags you've put on DataHawk.
This table lists all the keywords you tracked on DataHawk and information about when you started tracking them.
This is the list of all covered marketplaces and relative information about them.
This table lists all the products tracked and the corresponding tags you have assigned on DataHawk.
This table lists all the products you are currently tracking on DataHawk. If you want to enable the PRODUCT tables, you need to have tracked products on the platform beforehand.
All the projects you've created on DataHawk and the keywords and products within each of them.
Product - Monitor and benchmark changes in products details and performance
This table contains all the different badges' name and type that we observed on your products on any given day since they were tracked.
This table contains historical data of the daily product listings for all the products you track on our platform.
Every day we fetch the following information:
Marketplace, channel product id, group channel product id, date, time, name, price, brand, rating, rating count, buy box quantity, sales estimates, variations, image count, description, bullet points, rating percentages, A+ detection, dimensions, weight.
The following screenshot only shows a small part of it.
This table looks at all of DataHawk's data over the last 7 days and counts every time we saw your products appear in all of DataHawk's keywords.
This allows you to quickly know which keywords are of interest and that you might want to track.
This table shows a daily Listing Quality Score of your tracked products. This homemade KPI scans your product listings to give you an elaborate score.
It unveils a breakdown of the listing quality score for you to understand which part of your listing could be improved and how the score is being computed. The following screenshot shows a small part of the details available in this table.
This table contains all the different browse nodes and sales ranks that we observed on your products on any given day since they were tracked.
NB: We are sourcing this from 2 different parts of Amazon: the product listings and the top 100 Best Seller Rank. As for Walmart, this is only sourced from the top 100 BSR.
This table shows the latest version of the variations of your tracked products. For every product tracked (channel_product_id), you have its latest parent (group_channel_product_id). This is, again, scoped to a marketplace (marketplace_key).
SEO - Track, benchmark, and optimize products' organic search performance
A homemade KPI that estimates the search volume of the keywords you are tracking on a monthly basis.
Daily organic ranking (pages 1 to 5) of your tracked keywords and all visible data on the keyword page (channel product id, product name, number of ratings, rating, prime badge).
Daily sponsored ranking page of your tracked keywords along with all visible data on the keyword page (ASIN, product name, number of ratings, rating, prime badge).
Finance [requires a Seller Central source] - Run easier 3P sales reporting and correlate sales data with other metrics
This table shows all the aggregate of successful orders that occurred for a certain account per day. It's a high-level aggregated dashboard to follow what's happening on an account level.
This table lists all financial events by day, account key, and marketplace.
Some financial events can be attached to an order id and an ASIN. The column metric maps an Amazon Financial Event with its amount.
This is a detailed table to help you go as deep as a reimbursed gift wrap event.
This table shows all the orders (successful & canceled) that occurred for a specific account per day. The main difference with the FINANCE_ACCOUNT_METRICS table is that this isn't aggregated data. It shows the information on an order level with columns like PURCHASE_TIME, showing the actual time that the order went through.
This table shows all the successful orders that occurred for a specific account per day. It is the same table as FINANCE_ORDERS, except that it's filtered to show only the successful (status = "Shipped") orders.
This table aggregates orders on a product level daily. It's built to follow the performance of a product every day: number of orders, number of units sold, sales.
This table aggregates orders on a product monthly level. It's built to follow the performance of a product every month: number of orders, number of units sold, sales.
This table aggregates all profit & loss metrics and their amount per account key and day.
Advertising [requires an Advertising source] - Monitor and analyze your Ads performance
Advertising data aggregated by account and date, and sponsoring type.
Advertising data aggregated by account, campaign, date, and sponsoring type.
Advertising data aggregated by account, date, product, and sponsoring type.
Marketing [requires a Category source] - Monitor best seller rank and estimated sales changes of a tracked category
Daily estimated sales of products per tracked category.
Monthly total estimated sales per tracked category.
Daily sales ranks of products per tracked category.
Datafeeds - Access advanced views built for automated reports on Google Sheets
This schema carries the datasets we use to fill our automated reports on Google Sheets. They're structured in a compatible manner and, for clients that have a vast amount of Sources, may have truncated data. So be sure to use the other schemas if you've scaled past this point. To check, the first value of the first cell of any dataset in this schema would read "Truncated".
You can see which Google Sheets dashboard it is referring to by looking at the name of the table.
feed_profit_loss_product_detailed ⇒ profit & loss dashboard, product detailed table.