
RampUp 2020: RampUp for Developers Recap – The Architecture and Future Direction of the LiveRamp Platform



Speaker:

  • Andrew McVeigh – Chief Architect at LiveRamp

Our second RampUp for Developers session focused on the LiveRamp platform. LiveRamp’s platform processes many trillions of graph edges in order to deliver segments to hundreds of destination platforms. Powered by over 600 million unique online authentication events per month across our match network, our deterministic identity graph is the largest on the open internet, enabling us to build robust customer profiles and reach customers with the highest degree of accuracy. In this session, Andrew shows how this is performed in an efficient manner whilst preserving privacy and security, and explains where the architecture is evolving.

You can watch the video below and read the transcript below that. You can also download the slides from the presentation (which also appear throughout the video) here.

https://player.vimeo.com/video/397272444


Andrew McVeigh: Okay, fantastic. Let’s just test it out. Okay. So my name is Andrew McVeigh. I’m the Chief Architect at LiveRamp. I’ve been here for a year and a bit of change, and today I’m going to talk about the LiveRamp Platform: The Architecture & Future Direction.
Andrew McVeigh: Now I know because we’ve got LiveRamp people in the audience that the platform has many meanings in the company. I’m going to talk about a particular part of the LiveRamp platform, and I’m going to also introduce a set of terms and talk about some of the system schematics so that hopefully it will make the rest of the presentations flow more smoothly. So we’ve got some fantastic content coming up on stuff like our approach to CCPA, our approach to APIs, lookalike modeling, et cetera. We’ve got some really interesting stuff coming up, and I wanted to use this presentation to set the scene effectively to explain where we’re taking the platform and what the drivers are.
Andrew McVeigh: So a little bit about me. I’ve been writing software professionally for, sadly, 35 years now, and I sold my first software when I was in high school. I trained as an electronics engineer for my undergraduate degree and worked for about ten years in computational linguistics and telecommunications in Australia. I then traveled to London, decided to stay, and spent around 15 years working in financial services, doing a lot of work in investment banks on trading and risk management systems. I took a break at one point to be the lead architect for the first version of the American Express Blue Card, the smart card issuance system, and we built that system out. The company that built it wasn’t American Express itself; it was contracted to American Express, and they said to me, “We want to sell this to lots of companies. Don’t make it American Express specific.” What they were really asking for was, “You’ve given us a product; make it into a platform.” That led to a lot of platform and product thinking, and as a result I ended up doing a PhD on software extensibility, which is directly relevant to platforms. A lot of the terminology for platforms came from around that time.
Andrew McVeigh: About 10 years ago, I was approached by a company called Riot Games. Does anyone play League of Legends? Put up your hands if you’re a gamer. Okay, we’ve got about two or three people. League of Legends is a very well known online game. It’s played for about two-and-a-half billion hours a month by over a hundred million monthly active users, mostly in China, where people are playing lots and lots of games at the moment because of the coronavirus. Sorry, that’s possibly in poor taste; please ignore that joke. I worked on League of Legends for around four-and-a-half years, starting on their platform, where I built a lot of the tooling for APIs and for internal communications. By the time I finished, the software I built was used by over 200 internal projects. So again, a lot of experience in what it takes to build high-performance communications. After that, I re-architected the League of Legends game client, which is on over 180 million desktops now. That was a very different world, but a very interesting one. And interestingly, we actually had APIs internal to the game client which developers could use.
Andrew McVeigh: After four-and-a-half years at Riot Games, I decided it was time for a change and joined Hulu as its chief architect. Hulu was starting its push into live TV, and I joined to spearhead that. That was a fascinating time. Over the two years I was there, we went through an amazing series of events, and we developed the live TV offering in 15 months. It was an incredible crunch, but one of the best times of my life in terms of getting software out and seeing the whole team bond. We released the server and client sides on a single day, and it’s now, I believe, the largest internet-delivered TV service in the United States.
Andrew McVeigh: Looking for a different challenge, I joined LiveRamp a bit over a year ago as its chief architect. I’m presuming that most people know what LiveRamp does, but I didn’t before I joined; about a year-and-a-half ago I had to learn what LiveRamp actually did. So I’m going to give you a brief summary of what LiveRamp does, just in case you don’t know, and I’m going to use it as a jumping-off point to talk about how we’re transitioning from a product into a platform.
Andrew McVeigh: So basically, LiveRamp resolves identities at massive scale. So looking at this diagram here, on the left hand side, brands push their data into LiveRamp and then they use that to send it out to advertising destinations so that they can do targeted advertising, people-based advertising. This is our main use case at the moment, which is called onboarding. And as I discuss further on, onboarding is just one application for our identity resolution facilities. And one of the reasons we want to evolve from a product into a platform approach is so we can unlock that vast spectrum of applications which can use that identity resolution facility.
Andrew McVeigh: So looking at the left hand side, let me take a concrete example. Macy’s will have, for instance, a mailing list or a collection of people who have bought trousers from them in the last six months. There might be 30,000 people who’ve bought trousers at Macy’s in the last six months, and now Macy’s wants to advertise shoes to them. So what they do is send in a list of all the information they have on these customers – from loyalty programs, from all sorts of information you’ve given to Macy’s when you bought your trousers – and that data comes into us. Now we take that data and use what’s known as the offline identity graph to strip out all PII. After the offline identity stage, after the ingestion stage, a person in our system is represented as a completely anonymous number. There’s no ability to go back from that anonymous number to identifiable PII. And that, in a very real way, makes us very different from many other approaches in the market.
Andrew McVeigh: So our system allows this anonymous identity to travel around and be joined with other people’s data segments in a completely anonymous way. The identity we give people from the offline graph is known as an identity link. This is a very strong link to the notion of an anonymous person, and we can tie it to a set of attributes, which allows us to target that person without anyone ever knowing who they are. What happens then is, say, Macy’s will send in their segment data and say, “We want to target only people who’ve bought trousers and who are in an urban center.” They then combine it with a segment from a third party, which has similarly been anonymized – they’ve all been turned into identity links. We have a very powerful concept known as PELs (partner encoded links), where we can send that data out to the partner in completely anonymized form, in a way that can never be combined between Macy’s and a third party. So we’ve got a very powerful way to deal with anonymization.
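The talk doesn’t specify how PELs are computed, but the idea can be sketched with per-partner keyed hashing: the same anonymous identity link encodes to a different, stable token for each partner, so the outputs can never be joined across partners on the raw value. A minimal illustration (keys, IDs, and function names are hypothetical):

```python
import hmac
import hashlib

def partner_encode(identity_link: str, partner_key: bytes) -> str:
    """Encode an anonymous identity link for one partner.

    Keyed hashing means the same identity link yields a different,
    stable token per partner, so Macy's tokens and a third party's
    tokens cannot be combined.
    """
    digest = hmac.new(partner_key, identity_link.encode(), hashlib.sha256)
    return digest.hexdigest()

# Illustrative only: real keys would be managed per partner, server-side.
macys_key = b"macys-secret-key"
third_party_key = b"third-party-secret-key"

link = "IL-0042"  # an anonymous identity link
print(partner_encode(link, macys_key))        # token Macy's sees
print(partner_encode(link, third_party_key))  # a different token for the third party
```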
Andrew McVeigh: Once it’s on the offline side, it’s in a set of segments, and a segment is basically a collection of people. Erin’s laughing at me – many of these are very crude approximations of the real terminology. But basically, a segment is a collection of people that you want to target for advertising purposes. So you take that segment, which is all the people who’ve bought trousers, merge it with other segments, and then decide you want to deliver it. That’s where the online part comes in. The online part holds all of the information about destination IDs. We need to translate from our internal identity link representation into a partner ID – for instance, Google cookies – so that we can send it to Google in terms they understand. And to do that we collect a lot of online data.
Andrew McVeigh: So the offline data is basically things like name and postal address from when you filled out a loyalty program, et cetera. The online data is all of your online activity. What we do then is take that segment and turn it into destination identifiers through a process we call activation, and then we can send it out to one of 500 advertising destinations. We handle the burden of the integrations so the clients don’t have to. To put it in more concrete form: you purchase a pair of trousers at Macy’s, you’re sitting at home two weeks later watching TV, and suddenly an ad pops up for shoes or shirts, depending on what they want to target you with. That’s the power of the platform.
Andrew McVeigh: So we operate at massive scale, and you can see here that we’re operating at the level of the really big players in the space – we’re all at around 230 to 250 million identifiable people in our graphs. Where LiveRamp differs is that we’re trying to enable an open web by breaking that very tight link between content and advertising. We’re giving publishers options, and also brands options, because we want to promote higher quality content and give advertisers a lot more choice in where they target.
Andrew McVeigh: I’m going to refer to this schematic quite a few times in the rest of the slides, and it’s a very, very crude approximation of our systems. The platform I’m going to talk about today is the one the blue line flows through, all the way from the left side to the right side. This is a very crude approximation of our onboarding flow – the one I talked about, where Macy’s onboards data on people who’ve purchased trousers and we send it out to many destinations for advertising. At the bottom layer here we have the offline identity graph and the online identity graph. These represent platforms in their own right, but I’m not going to refer to them specifically; instead, I’m going to refer to the four boxes above them.
Andrew McVeigh: So what happens is data comes into our system as, say, a file that may have 10,000 segments encoded in it. It goes through the ingestion process, which talks to the offline identity side, known as Abilitec – a very mature system. Ingestion basically turns the data into anonymous identifiers, which at this point represent collections of people with certain attributes: one segment might be dog lovers, another might be cat lovers, another might be people who own luxury cars. Ingestion passes the data on to data access, which is where we store and manage segments – those marketing segments are stored and managed in data access. We can combine them and look at overlaps, and in particular, our platform uses a variety of sampling techniques that let us estimate overlaps across massive data sets. So you get a number like: these data sets overlap by 7%.
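The talk doesn’t say which sampling techniques data access uses, but MinHash is one standard way to estimate the overlap (here, the Jaccard similarity) of huge segments without comparing them row by row. A minimal sketch with toy segments:

```python
import hashlib

def minhash_signature(segment, num_hashes=128):
    """MinHash signature: for each of num_hashes salted hash functions,
    keep the minimum hash value seen across the segment's members."""
    sig = []
    for i in range(num_hashes):
        salt = str(i).encode()
        sig.append(min(
            int.from_bytes(hashlib.sha1(salt + m.encode()).digest()[:8], "big")
            for m in segment
        ))
    return sig

def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots estimates the Jaccard overlap."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)

# Two toy segments of identity links with a known overlap.
trousers = {f"IL-{i}" for i in range(0, 1000)}
urban    = {f"IL-{i}" for i in range(930, 2000)}

sig_t, sig_u = minhash_signature(trousers), minhash_signature(urban)
print(f"estimated overlap: {estimated_jaccard(sig_t, sig_u):.1%}")
# True Jaccard here is 70/2000 = 3.5%; the estimate lands in that neighborhood,
# and only the small fixed-size signatures are compared, not the raw segments.
```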
Andrew McVeigh: Now, when you’ve combined the segments and decided you want to deliver them, they get passed over to activation, which uses the online identity graph, called AIM, to turn them into destination identifiers so we can distribute them out to places like Google, Facebook, and MediaMath – all those 500 different destinations. So that’s our onboarding flow, and it does involve massive scale. The numbers I’m going to show you here are a rough approximation of the numbers shown before. This is simplified to a large extent, but as Sasha and his team alluded to in the GCP migration panel, we routinely use 50,000 to 100,000 GCP cores. That goes down when we optimize and up when we get more traffic, so it’s a continual balancing act to make sure we use our resources efficiently.
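Conceptually, activation is a giant keyed join: each identity link in a segment is translated into whatever ID the destination understands. A toy sketch of that translation (the real match tables hold billions of rows and run as distributed jobs, not in-memory dicts; all IDs here are made up):

```python
# Hypothetical match table from the online identity graph (AIM):
# identity link -> destination-specific ID (e.g., a cookie ID).
match_table = {
    "IL-0042": "goog-cookie-9f3a",
    "IL-0043": "goog-cookie-77c1",
    # ... billions more in reality, stored and joined in a distributed system
}

segment = ["IL-0042", "IL-0043", "IL-0044"]

# Activate: keep only the members we can translate for this destination.
activated = [match_table[il] for il in segment if il in match_table]
print(activated)  # IDs in terms the destination understands
```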
Andrew McVeigh: Now looking at the bottom layer, the offline identity side and the online identity side: in offline identity, we have around a hundred billion nodes and around a trillion edges, which allow us to resolve down to 240 million people in the US. We can very accurately track when someone has moved address, for instance – we know how to track that as a single person – and this is the system that performs the anonymization step that is so important. Then we have the online identity side, which consists of all sorts of online information: lots of authenticated traffic from our ATS solution, mobile device traffic, and also cookie traffic.
Andrew McVeigh: Then in ingestion, we ingest around 120 terabytes of files a day. We turn that into many segments, and we hold around 10 million segments at rest. Some of these segments are very, very large – like 100 million different identities. Then when we activate out, we’ve got around 10 million records per request – that’s a segment for delivery. We have to match against up to 100 billion partner IDs for, say, a destination as large as Google. We process around 30,000 batches per day, and we’re processing around 50 to 100 terabytes every 10 minutes or so. So, a lot of numbers. Then it goes to distribution, where we’re sending out 120 terabytes a day or so to 500 different destination types. We have an enormous investment in integrations with different advertising destinations, which takes the burden off the customer. All of that together is fairly vast system scale, and the reason I’m telling you this is that turning this system into a platform necessarily involves finding places to cut the product where we can avoid compromising efficiency too much.
Andrew McVeigh: Sorry, just checking my times. Perfect.
Andrew McVeigh: So we’ve got a very strong market fit with our onboarding product. It’s very, very powerful and very, very strong; a lot of people use it, to very good effect. But as we’ve been using it, we’ve realized it’s not the only use case for identity resolution. As we build more and more on this, even internally, we’re building things that try to use our system as a platform but have to go through the entire onboarding workflow in ways we don’t think are optimal. So this is part of LiveRamp’s journey from a product into a platform, and I’ve subtitled it Unlocking the Full Value of Identity Resolution.
Andrew McVeigh: So we want to become a platform where people can build apps on top of us in a way where internally and externally we offer the same power. Everyone is familiar with the notion of a platform these days: iOS is a platform, for instance, and Android is a platform. People build applications on top of a platform using its primitives, combining them in ways the original platform owners didn’t necessarily expect, and producing a lot of different applications. It’s a form of distributed innovation that allows for a very, very powerful ecosystem. Using this approach, you can unlock a lot more value for yourself internally and for customers externally.
Andrew McVeigh: So again, going back to the schematic, we’ve got the onboarding flow – in orange in this case – going all the way from ingestion through to distribution. Bear in mind that at the data scale we operate at, this pipeline is highly optimized, tweaked within an inch of its life, because every inefficiency costs us a lot of money. So the question is: how do we cut it up into smaller pieces such that applications sitting above us can combine those pieces in different ways and insert them into parts of their workflows? The challenge for us is, “How do we break these up in a way that doesn’t completely destroy the efficiency of the system?”
Andrew McVeigh: This very complicated diagram is part of the data access system, looking at a high level at some of the operations that occur. You can imagine that with this level of complexity and this level of tuning, we have to be very, very careful about where we make the cuts that expose those primitives. Essentially we’ve got a tension between modularity and efficiency, and that’s necessarily the case. Imagine a very complex pipeline processing thousands of terabytes of data. You decide to put a break in a particular place so that you can expose primitives A and B. You’re now passing a lot more data across that boundary, and you can no longer make assumptions about what format the data is in or how it’s going to be passed. We call this a composition problem.
Andrew McVeigh: Now we’ve been aware for a long time that we were a platform already, even through companies that we’ve bought. We bought a company called Pacific Data Partners – is that the right name? – who basically built a B2B system on top of us. They run an office out of Seattle, and rather than doing people-based marketing, they do business-based marketing. If they know from a company’s demographics that it’s in the market for network storage solutions, for instance, a brand can target that particular company – the entire business – so that when someone at that company is browsing the net, they’ll see an ad for a network storage solution. It’s a very powerful application of our identity resolution, but at the moment they have to build it on top of the existing onboarding flow, which makes it a bit clunky.
Andrew McVeigh: We’ve similarly got addressable and connected TV built on top of our platform, and vendor marketing – we’ve got a number of applications already using our system as a platform. So that’s one thread: we know we need to make a platform. Another very strong thread through the company is that some of these applications have built very advanced capabilities. For instance, the addressable and connected TV side – the vendor marketing side – uses exact counts for segment overlaps. That exact counting happens through bitmaps: each segment is turned into a bitmap against a universal set of identity links for that particular instance, and a logical AND of two bitmaps forms the intersection. We call that exact counting. Now, the platform underneath only offers approximate counting, because it deals with massive scale – the bitmaps can’t handle data of that size.
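The exact-counting approach is concrete enough to sketch: fix a universe of identity links, represent each segment as a bitmap over that universe, AND the bitmaps, and popcount the result. A minimal version using Python integers as bitmaps (production systems would typically use compressed bitmap structures; the segments here are toys):

```python
def to_bitmap(segment, universe_index):
    """Set bit i for each member whose position in the universe is i."""
    bm = 0
    for member in segment:
        bm |= 1 << universe_index[member]
    return bm

# Universal set of identity links for this instance, each assigned a bit.
universe = [f"IL-{i}" for i in range(10)]
index = {il: i for i, il in enumerate(universe)}

dog_lovers = {"IL-1", "IL-3", "IL-5", "IL-7"}
urban      = {"IL-3", "IL-4", "IL-5", "IL-9"}

# Logical AND of the two bitmaps is the intersection; counting set bits
# gives the exact overlap.
intersection = to_bitmap(dog_lovers, index) & to_bitmap(urban, index)
print(bin(intersection).count("1"))  # exact overlap count: 2 (IL-3 and IL-5)
```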
Andrew McVeigh: So the initial approach we took – and we call this unification, by the way, so we’ve got platformization and unification – was to look at bringing exact counting directly into the platform in a unification approach. Bring exact counting directly into the platform and we end up with a superset of both facilities – what could be better? As it turned out, as we started doing this, we realized that the deployment strategies of the exact counting systems, the vendor marketing systems, et cetera, are different, and we were doing a lot more work than we needed to. So we took a step back and asked, “Is unification what we’re looking for? And what are the synergies between platformization and unification?” As we thought about it, we realized that unification can be used as a powerful way to form a platform.
Andrew McVeigh: Imagine if we could build an application as powerful as our vendor marketing on top of our platform in a way that exposes the APIs. This is our key insight: that unification and platformization are exactly the same thing. In this diagram, you can see the platform in blue at the bottom, and the little I, E, S, D, and D2 boxes are API surfaces we’re starting to expose above the platform, with the application sitting above them. We have an I-primitive, where you can use an API to ingest segments directly into the data access side, which is where the canonical reference for each segment is kept. A segment at this point can only be involved in estimated counts for overlaps, but there’s also a project-and-extract primitive, the E-primitive, where you can pull a segment out into the application space, construct a bitmap representation of it, and keep it in sync. That gives you the exact count. So you can have exact counts if you need them, or fall back to inexact counting if the data volume is too large.
Andrew McVeigh: Then we’ve got the S-primitive for segment overlaps and combinations, and the D-primitive for delivery. Because the canonical version of a segment always lives in the data access system, delivery is very easy. And then we’ve got the direct-to-distribution API, D2: if you already know the identifiers for the destination, you can send your segment directly.
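To make the composition idea concrete, here is a hypothetical sketch of an application driving the five primitives. The function names, signatures, and data are illustrative stand-ins, not LiveRamp’s published API; they just show how the pieces slot together:

```python
# Hypothetical stubs for the five primitive surfaces.

def ingest(name, members):            # I: canonical segment into data access
    return {"id": name, "members": set(members)}

def extract_bitmap(segment, index):   # E: project into app space for exact counts
    bm = 0
    for m in segment["members"]:
        bm |= 1 << index[m]
    return bm

def overlap(bm_a, bm_b):              # S: overlap/combine two segments
    return bin(bm_a & bm_b).count("1")

def deliver(segment, destination):    # D: deliver the canonical segment
    print(f"delivering {segment['id']} to {destination}")

def direct_to_dist(dest_ids, destination):  # D2: destination IDs already known
    print(f"sending {len(dest_ids)} IDs straight to {destination}")

# An application composing the primitives:
index = {f"IL-{i}": i for i in range(100)}
trousers = ingest("trouser-buyers", [f"IL-{i}" for i in range(0, 60)])
urban = ingest("urban-centers", [f"IL-{i}" for i in range(40, 100)])
print(overlap(extract_bitmap(trousers, index), extract_bitmap(urban, index)))  # 20
deliver(trousers, "destination-x")
direct_to_dist(["dest-id-9f3a"], "destination-x")
```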
Andrew McVeigh: Just to give you an idea of the power and efficiency you can get from breaking the workload into smaller parts: we did a proof of concept on the direct-to-dist API – Davin is going to talk about this in his presentation a couple of talks from now – and it reduced the delivery time for certain connected TV use cases from 24 hours down to one and a half hours. So again, breaking the flow up into little primitives gives applications a lot more power and a lot more granularity to customize it to their workflows.
Andrew McVeigh: We’ve also known for a long, long time that our configurations are complex. We’ve pushed a lot of business logic into our systems, which comes from incorporating every edge case from every application and loading it into the onboarding side. We’re aiming to push some of that complexity out by enabling systems to configure the platform the way they want to, and that’s going to be a big focus for us. We currently end up doing a lot of manual work on configurations that we would like to avoid in the future.
Andrew McVeigh: Now, as I mentioned, we’re going to have an excellent presentation on APIs a few talks from now, but I’ll talk briefly about how we use APIs. I’ve developed a lot of APIs over the years – Hulu had 800 microservices, for instance, and every one of those had an API, and I built a lot of the infrastructure for APIs at Riot Games. We’ve taken a fairly modern approach here. At Riot we used a code-first approach, where you construct the code and then ask it for its API. Here we instead focus on Swagger, a textual format for describing REST APIs, also known as OpenAPI 3, and it’s very, very powerful. We start with the Swagger definition, and from it we can generate code directly, which saves an enormous amount of time. We can also generate API docs using Swagger UI or ReDoc, which makes sure the docs are completely aligned with the code. So Swagger’s a very good place to start.
Andrew McVeigh: Layered on top of that, we’ve got a set of maybe 30 API standards covering things like pagination, what resources look like, the versioning strategy – all the stuff we know we need to deal with – and we put those in a document we call RSC API 3. Those API standards are also predicated on the fact that we want a very purist, resource-oriented approach for our APIs, so we expect people writing Swagger to use a resource-oriented approach and follow all our standards. That’s proven to be quite a high burden – it takes a long time to write Swagger this way. So we’ve also written a small domain-specific language that pushes you into a resource-oriented, resource-centric approach and automatically conforms to all of the API standards. We call it Reslang.
Andrew McVeigh: This is just one example: the direct-to-dist API written in Reslang. It’s not all of it, because there are obviously lots of supporting structures, but you can see at the top that there’s a request resource called distribution request, which means it’ll be /V1/DistributionRequest as a route. Down at the bottom there are GET, POST, and MULTI-GET operations: you can retrieve a distribution request, create one, and retrieve them in bulk. All of these structures let you define something that is always 100% consistent with our standards, and when we put this through the Reslang interpreter, the generated ReDoc output looks exactly like this.
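The Reslang source itself isn’t reproduced here, but the spec-first flow it feeds is easy to sketch: everything ultimately becomes an OpenAPI 3 document that drives both code generation and docs. A minimal, hypothetical fragment for a resource like the one described above, built as a Python dict and dumped to YAML (route and operation names are illustrative):

```python
import yaml  # pip install pyyaml

# Hypothetical OpenAPI 3 fragment mirroring the resource described above.
spec = {
    "openapi": "3.0.0",
    "info": {"title": "Direct-to-Distribution API", "version": "1.0"},
    "paths": {
        "/v1/distribution-requests": {
            "post": {"summary": "Create a distribution request"},
            "get": {"summary": "Multi-get distribution requests"},
        },
        "/v1/distribution-requests/{id}": {
            "get": {"summary": "Retrieve a single distribution request"},
        },
    },
}

# The YAML form is what the generators consume: server stubs, clients, and
# Swagger UI / ReDoc docs all come from this one source, keeping code and
# documentation aligned.
print(yaml.safe_dump(spec, sort_keys=False))
```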
Andrew McVeigh: So just a quick summary of what I’ve talked about. We’re creating a platform out of our core product. We’re trying to unlock the full value of the underlying identity resolution features. Now some of our challenges are modularity versus performance; where do we cut? And we’re using the applications already built on our platform and the applications that other people want to build to work out where exactly to cut the system so that we don’t do unnecessary work in breaking up that very finely tuned pipeline. And now our goal is to let a thousand applications bloom.
Andrew McVeigh: So we’ve got about three minutes left if there’s any questions? Yeah, please.
Audience Question: What’s the lifetime of your data from the moment of ingestion? What’s your data storage approach?
Andrew McVeigh: Well, for a segment… Sorry. Okay. Thanks, Sean.
Andrew McVeigh: “What’s the lifetime of our data in our system?” For a segment a customer pushes in, it’s completely up to them – it’s their data, sitting in our platform under management, and it’s up to them what they do with it. Data that comes in on the online side is much more ephemeral, because it starts being of very low value after around 60 days. So we effectively have a sliding window where we get rid of data that data science techniques deem to be of low value.
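That sliding window is simple to picture: drop online events once they age past the point of usefulness. A minimal sketch, assuming a fixed 60-day cutoff for illustration (the real cutoff is driven by data science, not a constant):

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=60)  # illustrative cutoff from the talk

def still_valuable(event_time, now=None):
    """Keep an online event only while it falls inside the sliding window."""
    now = now or datetime.now(timezone.utc)
    return now - event_time <= RETENTION

now = datetime.now(timezone.utc)
events = [now - timedelta(days=d) for d in (1, 30, 59, 61, 120)]
print(sum(still_valuable(t, now) for t in events))  # 3 of 5 survive the window
```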
Audience Question: Are we building the APIs for ourselves internally or for external partners?
Andrew McVeigh: Yeah, that’s a really good question. So the question is, “Are we building the APIs for ourselves internally or for external partners?” And the answer is both. We want to give people externally the same power that we give ourselves internally. It keeps the platform honest, and that’s in a way the definition of our platform. We’re not there yet – we’re using a lot of internal stuff to dogfood it – but we figure that if we dogfood in a way where we’re trying to give exactly the same power, we’ll end up with a very powerful platform indeed.
Andrew McVeigh: We’ve probably got time for one more question. That’s fine. Come and see me after the break or something if you want to talk more. Thank you.

Interested in more content from RampUp?

Clicking on the links below (to be posted and updated on an ongoing basis) will take you to the individual posts for each of the sessions, where you can watch the videos, read the full transcripts, and download the slides presented.

RampUp for Developers’ inaugural run was a great success and was well attended. Many interactions and open discussions were spurred by the conference tracks, and we look forward to making a greater impact with engineers and developers at future events, including our RampUp on the Road series (which takes place throughout the year, virtually and at a variety of locations), as well as during next year’s RampUp 2021 in San Francisco. If you are interested in more information or would like to get involved as a sponsor or speaker at a future event, please reach out to Randall Grilli, Tech Evangelist at LiveRamp, by email: [email protected].