
Launch HN: Narrator (YC S19) – a data modeling platform built on a single table https://ift.tt/34bcMD0

Hi HN, We’re Ahmed, Cedric, Matt, and Mike from Narrator (https://www.narrator.ai). We’ve built a data platform that transforms all data in a data warehouse into a single 11-column data model and provides tools for analysts to quickly build any table for BI, reporting, and analysis on top of that model.

Narrator initially grew out of our experience building a data platform for a team of 40 analysts and data scientists. The data warehouse, modeled as a star schema, grew to over 700 data models from 3000+ raw production tables. Every change or new analysis took forever because we had to manage the complexity of these 700 different models. With all these layers of dependencies and stakeholders constantly demanding more data, we ended up making lots of mistakes (e.g. dashboard metrics not matching). These mistakes led to a loss of trust, and soon our stakeholders were off buying tools (Heap, Mixpanel, Amplitude, Wave Analytics, etc.) to do their own analysis.

With a star schema (also core to recently IPO-ed Snowflake), you build the tables you need for reporting and BI on top of fact tables (what you want to measure, e.g. leads, sales…) and dimension tables (how you want to slice your data, e.g. gender, company, contract size…). Using this approach, the number of fact and dimension tables grows, along with their complexity, in relation to the number of questions, datasets, and metrics the business needs answered. Over time the rate of new questions increases rapidly, and data teams spend more time updating models and debugging mismatched numbers than answering data questions.
What if, instead of using the hundreds of fact and dimension tables in a star schema, we could use one table with all your customer data modeled as a collection of core customer actions (each a single source of truth), and combine them to assemble any table at the moment the data analyst needs it? Numbers would always match (single source of truth), any new question could be answered immediately without waiting on data engineering to build new fact and dimension tables (the table is assembled when the analyst needs it), and investigating issues would be easy (no nested dependencies of fact and dimension tables that depend on other tables). After several iterations, Narrator was born.

Narrator uses a single 11-column table called the Activity Stream to represent all the data in your data warehouse. It’s built from SQL transformations that turn a set of raw production tables (for example, Zendesk data) into activities (ticket opened, ticket closed, etc.). Each row of the Activity Stream has a customer, a timestamp, an activity name, a unique identifier, and a bit of metadata describing it.

Creating any table from this single model, made up of activities that don’t obviously relate to each other, is hard to imagine. Unlike a star schema, we don’t use foreign keys (the direct relationships in relational databases that connect objects, like employee.company_id → company.id) because they don’t always exist when you’re dealing with data in multiple systems. Instead, each activity has a customer identifier which we use, along with time, to automatically join within the single table to generate datasets.

As an example, imagine you were investigating a single customer who called support. Did they visit the web site before that call? You’d look at that customer’s first web visit, and see if that person called before their next web visit.
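The single-customer check described above can be sketched in a few lines of Python. This is only an illustration of joining on customer and time, not Narrator's implementation: the `Activity` fields, the activity names `visited_website` and `called_support`, and the sample rows are all assumptions made up for the example, and the sketch models only the kinds of columns the post names rather than the real 11-column schema.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Activity:
    # Hypothetical field names covering only the columns the post mentions.
    activity_id: str
    ts: datetime
    customer: str
    activity: str
    metadata: dict = field(default_factory=dict)


def called_before_next_visit(stream, customer):
    """True if the customer called support after their first web visit
    and before their next web visit (or at any later time when there is
    no next visit)."""
    rows = sorted((a for a in stream if a.customer == customer),
                  key=lambda a: a.ts)
    visits = [a.ts for a in rows if a.activity == "visited_website"]
    if not visits:
        return False
    first_visit = visits[0]
    next_visit = visits[1] if len(visits) > 1 else None
    for a in rows:
        if a.activity != "called_support":
            continue
        if a.ts > first_visit and (next_visit is None or a.ts < next_visit):
            return True
    return False


# Made-up sample rows: no foreign keys, only customer + time + activity.
STREAM = [
    Activity("1", datetime(2020, 9, 1, 10, 0), "alice", "visited_website"),
    Activity("2", datetime(2020, 9, 1, 11, 0), "alice", "called_support"),
    Activity("3", datetime(2020, 9, 2, 9, 0), "alice", "visited_website"),
    Activity("4", datetime(2020, 9, 1, 10, 0), "bob", "visited_website"),
    Activity("5", datetime(2020, 9, 3, 10, 0), "bob", "visited_website"),
    Activity("6", datetime(2020, 9, 4, 10, 0), "bob", "called_support"),
]

print(called_before_next_visit(STREAM, "alice"))
print(called_before_next_visit(STREAM, "bob"))
```

In this sample, alice calls between her first and second visits, while bob only calls after his second visit, so the check separates the two. The function never looks up a relationship between the visit and the call; the customer identifier and the ordering of timestamps are the only join keys.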
Now imagine finding all customers who behaved this way per month: you’d have to take a drastically different approach with your current data tools. Narrator, by contrast, always joins data in terms of behavior. The same approach you take to investigate a single customer applies to all of them. For the above example you’d ask Narrator’s Dataset tool to show all users who visited the website and called before the next visit, grouped by month.

We started as a consultancy to build out the approach and prove that this was possible. We supported eight companies per Narrator data analyst, and now we’re excited for more data folks to get their hands on it so y’all can experience the same benefits.

We’d love to hear any feedback or answer any questions about our approach. We’ve been using it ourselves in production for three years, but only launched it to the public last week. We’ll answer any comments on this thread and can also set up a video chat for anyone who wants to go more in-depth.

September 30, 2020 at 09:30PM
