Welcome to YC Tech Talks: Machine Learning. I'm Paige from the Work at a Startup team, the team that helps people get jobs at YC startups. For tonight's event we have founders who are going to be talking about interesting problems they deal with in the machine learning space, and a little bit about the problems that keep them up at night.

Hi everyone, I'm Andrew Yates, CEO and founder of promoted.ai. I'm going to be talking about composing models: how to stack recommendation models to always win. So what does Promoted do? We rank search results, promoting the best listings at the top to increase revenue. You're familiar with this problem if you've ever used Facebook's news feed, Google search, Airbnb, or Amazon: there's a list of things, you want to sort it, and there's some objective. It's a very common machine learning task: what should I show you, at this moment, that you're most interested in? And then we can auction that off and make money on it: if I can show you these things and you would naturally rank tenth, but you want to be first, here's how much you'd pay for that. We run an auction, and then you have an ads business.

In our machine learning task there are many, many models and techniques. It's a very well-established problem: there are many alternatives, there are vendors, there are in-house teams, all with different strategies and techniques. A recurring challenge is that there's frequently a trade-off: some models are good at one thing, other models at another. For example, some models are trained on past engagement, so they're very good for established items. But what about new items that have no engagement? Those do poorly, so then you maybe need some sort of content-understanding model; but the content-understanding model doesn't do as well on items where we already have a tremendous amount of user engagement.
Another example: we might have a model that is very fast to update, within the current day. For an ads model, say, people are always creating new campaigns that may run for only a day, so the model needs to update very quickly. Contrast that with a gigantic recommendations model trained on years of user preferences, some gradient-descent-style algorithm, that updates very slowly: it won't learn a pattern until weeks and weeks after accumulating data. But you want both of these things. Then there are different aspects of the objective you're trying to accomplish in your search and feed: relevance versus engagement, for example. I want things that people are going to buy, and I also want things that are relevant to the search query as people would judge it in some sort of human review.

So you have many different types of models, and you have different production constraints: data availability and dimensionality, inference time, different trade-offs in reliability and fault tolerance. The challenge, as a machine learning practitioner, or for us at Promoted, is: what do we do to always win in A/B experimentation? How do we always win? And the answer is that you will always lose if you always try to make a single best model. What we do instead is combine all the models together. It's a very straightforward, pragmatic solution: take all of the good things, combine them, and you get a better model; in the worst case it's going to be as good as the individual signals you're composing.

I'm going to talk a little bit about how this is actually done in practice. There are two techniques for doing this effectively. They sound simple, but there's actually a lot of really interesting theory behind them, and very successful techniques use one of these two.
First is horizontal composition. This is as simple as taking all of the models and averaging them together, and you usually get a better result than any individual model you started with. This is the wisdom of crowds, and some extremely powerful, successful models are based on this idea. One is the PID controller, if you're familiar with real-time control systems: do you want the difference now, the integral of the difference, or the derivative of the difference? How about you just add them all together with weights? Yes, it works really well. Gradient boosted decision trees: which decision tree is best? Let's just take a lot of decision trees and average the results together, and that gives us the best possible model. It's a very powerful approach, very difficult to outperform even with very sophisticated techniques.

The advantages of horizontal composition are that it's simple, effective, and easy to understand. A disadvantage is that it's hard to tune, because the combination is itself a type of model. If you've ever had an internship or a job on a ranking or recommendations team, you know that as soon as you have a simple average, people ask: can we make it a weighted average? Can we multiply instead? Can we put some non-linear transformation on it? It can get difficult to answer what the best model on top is, because the composition is usually something that was very easy to get started with, not an organized design. From an infrastructure perspective, an advantage is that you can compute all of these signals in parallel; it's efficient and modular. You don't depend on any single signal being computed before you run another model's inference: you run all of them in parallel and then combine them.
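In code, the horizontal composition described here can be sketched roughly as follows. This is a minimal illustration with made-up scores from three hypothetical models, not Promoted's actual system:

```python
import numpy as np

# Hypothetical per-item scores from three independently trained rankers
# (e.g. an engagement model, a content-understanding model, a freshness model).
engagement_scores = np.array([0.80, 0.10, 0.40, 0.55])
content_scores    = np.array([0.60, 0.30, 0.50, 0.45])
freshness_scores  = np.array([0.20, 0.70, 0.45, 0.50])

def compose_horizontally(score_lists, weights=None):
    """Combine model outputs by (weighted) averaging -- wisdom of crowds."""
    stacked = np.stack(score_lists)  # shape: (n_models, n_items)
    return np.average(stacked, axis=0, weights=weights)

# Plain average: every model gets an equal vote.
combined = compose_horizontally(
    [engagement_scores, content_scores, freshness_scores])

# Weighted average: trust the engagement model a bit more.
weighted = compose_horizontally(
    [engagement_scores, content_scores, freshness_scores],
    weights=[0.5, 0.3, 0.2],
)

ranking = np.argsort(-combined)  # best item first
```

Note that the three base models are independent of each other, which is exactly why their scores can be computed in parallel before the final combine step.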
The other type of model composition is vertical composition. This is when model outputs are inputs to another model. A big disadvantage is that it can be very intensive to log training data. A little bit about why Promoted is very successful at doing this: we are in some ways a data streaming infrastructure business. We take every single inference and log it with a tremendous amount of metadata and features, so that we can take every model output at every inference, log it, and train on top of it. Depending on what you're trying to do in your system, that can be infeasible.

Another way of thinking of vertical composition is as a form of feature engineering: I have these raw signals, and I need to transform them in some way before they go into the main model. You can think of vertical composition as a really big version of feature engineering. An advantage of vertical composition is that the composition itself is learned as part of the model, whatever the architecture is. Unlike horizontal composition, where you have an average, or a weighted average, or some other ad hoc combination, the composition here is just part of the model: you put in the signals, and the model definition figures it out for you. The disadvantage is that computation is serial: you have to finish executing all of the signals before you can start on the next layer of execution.

One really interesting thing about this concept of model composition, as opposed to the more typical machine learning idea of a single great model, is that it helps you think in terms of organizing an entire machine learning engineering organization. If you are running Pinterest, or Facebook, or Airbnb, you don't have a model. It's not "the recommendation model", and you don't have the one recommendation engineer who built it.
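The vertical (stacked) composition described above can be sketched like this, with entirely hypothetical signals and labels. A real system would train a full second-stage model on logged inference data; here the "meta model" is just a linear least-squares fit so the idea stays visible:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical logged data: each row is one inference, each column is the
# output of a base model (click model, content model, freshness model).
n = 1000
base_outputs = rng.uniform(0.0, 1.0, size=(n, 3))

# Pretend the "true" objective is a fixed blend of the base signals plus
# noise; in a real system this label comes from logged user engagement.
true_weights = np.array([0.6, 0.3, 0.1])
labels = base_outputs @ true_weights + rng.normal(0.0, 0.01, size=n)

# Vertical composition: a second-stage (meta) model trained on the
# first-stage outputs, so the combination itself is learned from data.
learned_weights, *_ = np.linalg.lstsq(base_outputs, labels, rcond=None)

def stacked_score(signals):
    """Final score = learned combination of base-model outputs."""
    return signals @ learned_weights
```

Note the serial dependency: every base signal must be computed and logged before the meta model can be trained or evaluated, which is the infrastructure cost the talk describes.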
You have many different teams doing different pieces of this, all working in parallel and evolving over time to build and deliver the final product. So how do you think of models as pieces among other models, and then map that onto an entire engineering organization, so that all of these pieces work together, literally in the computer, but also as organizations and people working together?

One way this is done is to separate meaning from implementation, because different models have different characteristics. As an example, a click prediction model means: this is the probability of a click on this specific item in this specific location. That means you can change how the model is implemented, say from a gradient boosted decision tree to some sort of neural network, and it doesn't fundamentally change what the signal is meant to be. It can then be used in another system that says: my feature is the probability of a click for this item. It doesn't matter how that was computed; it only matters that it has the same interface. So you can start thinking of models as having an abstract interface, in the same way that you build other types of software, and you can apply the same software-organization concepts, like microservices.

One example of this is in ads systems, the idea of separating what is computed analytically from what is computed approximately. This is something Promoted is doing: you can separate the price, what's the optimal amount someone should bid, from the other aspects of the rest of the system, like the probability of a click or conversion, or other objectives around user experience versus ad revenue.

That's my few-minute talk; I'd love to take any questions.
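The "separate meaning from implementation" idea can be sketched as an abstract model interface. The names here (`ClickModel`, `p_click`, the toy implementations) are illustrative only, not Promoted's actual API:

```python
from typing import Protocol, Sequence

class ClickModel(Protocol):
    """Interface: p_click(item, position) -> probability of a click.

    The *meaning* of the signal is fixed; the implementation behind it
    (GBDT, neural net, hand-tuned rule) is free to change over time.
    """
    def p_click(self, item_id: str, position: int) -> float: ...

class HandTunedClickModel:
    # Toy stand-in for a first version: clicks decay with position.
    def p_click(self, item_id: str, position: int) -> float:
        return 1.0 / (1.0 + position)

class NeuralClickModel:
    # Toy stand-in for a later neural implementation; same interface.
    def p_click(self, item_id: str, position: int) -> float:
        return 0.9 ** position * 0.5

def expected_clicks(model: ClickModel, items: Sequence[str]) -> float:
    # Downstream code depends only on the interface, not the implementation,
    # so swapping HandTunedClickModel for NeuralClickModel changes nothing here.
    return sum(model.p_click(item, pos) for pos, item in enumerate(items))
```

This is the same design move as microservices: a team can rewrite the model behind `p_click` without any consumer of the signal changing.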
How do you pick your models within your compositions? And have you run into a scenario where a single model, or a few models, negatively skewed your results?

Great questions. For picking models: it's engineering. As an engineer, a practitioner working at a job, it's generally not a research problem; you're not developing new techniques or new modeling. Your job is to accomplish some task for a system objective. So the short answer is: it depends. The longer answer: choose models that work reasonably well, and move to a more complicated model only if it's worthwhile to spend the resources and energy to do so. There's a whole discipline around that. Start with a linear model, or a hand-tuned rule, and then decide whether the additional complexity is worth investing in.

As for the other part, is it possible to overdo this? Yes. If you follow Elon Musk's Twitter feed, there was that microservices video he recently posted; models are the same way. You can construct a horrible, Kafkaesque world of models feeding into models all over the place, which could all be condensed down to something relatively straightforward. That's sometimes more of a human organizational problem than an engineering or technical problem. From the engineering and technical side: sure, not every signal is going to add value, and you may not have the model capacity or the training data to match your domain, so simply increasing the complexity or the number of signals may not actually improve your objective.

How has your experience been with developers who might not be very familiar with certain ML concepts, or even the fundamental basics, interacting with these technologies?

This is where the model interfaces are important, this concept of: this is what this model means, and this is what it's supposed to be used for. You don't have to understand how it works internally, and often you shouldn't and won't.
Say the signal is the probability of a click. You don't need to know how it's computed; it just is the probability of a click, and you can still understand what it means. That's in contrast to some other types of black-box models, where you can't understand what the score means without the entire final composition. An example in this domain is a learning-to-rank model: the score it produces only matters in the context of the other scores in the same result set.

What I've seen is that most intelligent software engineers, which is to say all software engineers, don't have a problem with the more complicated theory: they recognize that it's complicated and don't try to understand it. Where I've seen less experienced people get into trouble is using models without understanding what they're meant to do, running into a mess, and then trying to fix it with A/B experiments: well, it's a mess no one can understand, let's just run an A/B experiment. That's where I see people burn a tremendous amount of time and energy, versus simply not understanding how gradient descent works, which in practice you don't really need to know if you're just consuming a signal from some other system.

With that, I'm going to turn it over to Josh.

Well, if you're interested in learning about deep learning and how it works, and not necessarily using it for anything, then maybe this presentation will be up your alley: how and why we created one of the fastest 3D reinforcement learning simulators. What we built is a reinforcement learning environment called Avalon; we actually just presented it at NeurIPS, the machine learning conference, this morning. It's open source and free; anyone can download it and play around with it.
It's a procedurally generated set of worlds, an infinite number of worlds with different tasks. There are buildings and predators and tools, and it's sort of a game like Minecraft, in which reinforcement learning agents can learn to interact with the world.

Why did we build this? Our goal at Generally Intelligent is to make more intelligent software agents. And why do we want that? We want to automate boring tasks, we want to cure diseases; there are all sorts of really cool things we could do if we had very intelligent software. Today we have some pretty cool machine learning: we can learn to rank things, we've got things like ChatGPT, but it's still pretty far from AGI. Here's a good example I pulled just this morning from ChatGPT. Someone asked it: what is the fastest marine mammal? It says the fastest marine mammal is the peregrine falcon. A falcon is not a marine mammal. So then, okay, maybe it's the sailfish; that's not a mammal either. It just goes on; it doesn't really know things, in one sense. In another sense it's very powerful and very interesting, and it can definitely do lots of really cool stuff, but these systems are still pretty far from AGI. Even Sam Altman, previously at Y Combinator, has agreed: a lot of people think this is AGI, but he recognizes it's obviously not very close yet.

These systems are very powerful, though. When we apply AI to a particular task, like ranking things, or playing Go or Dota, it does extremely well, often much better than people. So isn't this a contradiction? The real problem is that we want general intelligence. A really good definition from Shane Legg at DeepMind, via the definition-of-intelligence page on Wikipedia, is that intelligence is the ability to achieve many goals in a wide range of environments.
So really what we want is a way to construct and evaluate a wide range of problems and environments: in other words, a simulator. We actually did a lot of what amounts to customer research, but with researchers, as we were developing this, and what we heard over and over again is that one of the biggest things holding back the reinforcement learning field is a lack of really good benchmarks. A lot of people work on Minecraft or Atari or other games, but those are really capped in their ability to let us build interesting agents. So Avalon is built from the ground up as a simulator made specifically for reinforcement learning.

Most systems use existing games, like Atari or Unity or Minecraft, and those have made trade-offs to be good games, which is very different from what you want in a reinforcement learning simulator. In a game you want things to be fun, but that's not what you want in an RL simulator; instead you want it to resemble the tasks people do every day, which are often kind of grindy and not very fun. A game wants to run at 30 to 60 frames per second; in a simulator you want a thousand frames per second, or ten thousand. A game wants to be profitable; here we want it free and open source so people can do research on it. A game wants lots of features; we want this to be really debuggable and simple. A game should be challenging for adults; here we want a range of challenge, so some things are very easy, so the agent can get started, some things are very challenging, and there's a wide spectrum in between.

So what we did is build Avalon on top of the Godot game engine, which is actually a really cool engine. It's completely open source, it has physics and rendering and everything inside it, it's cross-platform, it supports VR, and it's about a 30-megabyte download, a single executable.
It's really easy to use, it has lots of tutorials and an active community, and it has a really good debugger and editor: a great base to build Avalon on. It's also nice and simple, and so what we did is basically pack it up into a crazy-fast simulator. It was simple enough that we could reorder things to make it into a deterministic, actual simulator, where we can say: step, wait for the agent, step, wait for the agent, which is very different from how a game is normally played.

We also created our own custom EGL rendering backend to avoid needing an X server at all, because we want to run on headless Linux machines with lots of GPUs in the cloud; this avoids some extra frame buffering and copying. We also tweaked the engine so that physics runs much less frequently than it normally would. A normal game might run physics 30 or 60 or maybe 120 times per second; here we update physics just 10 times per simulated second, which lets us do a lot less physics computation, although at the cost of some funny physics bugs if you don't tune it properly: things fall through the floor, or go through each other, or do all sorts of weird stuff. It took a while to get down to this very minimal amount of physics work.

We also did a lot of work tracing through the OpenGL rendering with the NVIDIA profiler until we trimmed out everything: transparency, don't need that; textures, don't need those; mipmaps, don't need those; shadows, you can turn those back on if you want more visual realism. We did this to get to about 10,000 frames per second; eventually the glClear call, which just clears the screen, became a significant fraction of our rendering time. Another thing we did was transfer the data in a very fast way via shared memory, using numpy, and then profile and move things to C++ where necessary.
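The shared-memory transfer mentioned here can be sketched with Python's standard `multiprocessing.shared_memory` module. This is a single-process illustration of the zero-copy idea, with a made-up frame shape; it is not Avalon's actual implementation:

```python
import numpy as np
from multiprocessing import shared_memory

FRAME_SHAPE = (64, 64, 3)   # hypothetical observation size
FRAME_DTYPE = np.uint8

# Simulator side: allocate a shared block and view it as a numpy frame.
shm = shared_memory.SharedMemory(create=True, size=int(np.prod(FRAME_SHAPE)))
sim_frame = np.ndarray(FRAME_SHAPE, dtype=FRAME_DTYPE, buffer=shm.buf)
sim_frame[:] = 128  # "render" a frame directly into shared memory

# Agent side (normally a different process): attach by name, zero-copy view.
agent_shm = shared_memory.SharedMemory(name=shm.name)
agent_frame = np.ndarray(FRAME_SHAPE, dtype=FRAME_DTYPE, buffer=agent_shm.buf)

obs_mean = float(agent_frame.mean())  # read without any pickling or copying

# The simulator writes a new value; the agent's view sees it immediately.
sim_frame[0, 0, 0] = 255
first_pixel = int(agent_frame[0, 0, 0])

# Release the numpy views before closing, then clean up the shared block.
del sim_frame, agent_frame
agent_shm.close()
shm.close()
shm.unlink()
```

The point is that observation frames never pass through a serializer: both sides read and write the same physical memory, which is how you avoid per-frame copy overhead at thousands of frames per second.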
We also wrote our own reinforcement learning rollout-worker logic, to work around the fact that this is a much more complex environment: it takes a little while to reset when the agent dies, we want to change to a new world, and there's a whole lot of other stuff. Basically, we did all these things and ended up at maybe seven to ten thousand frames per second on a single GPU, which is pretty impressive, and we're hoping to get to the point where even a single agent runs about 100 times faster than real time, so you could train the equivalent of a one-year-old in maybe three and a half days.

There's a lot of interesting future work to be done. One thing we'll be doing is moving the Godot game process into a library rather than a standalone process, which will let us do batched rendering. We'll also do more with multi-threading so we can run multiple agents at the same time in a single simulation, and move some of the parallelism out of the Python level down to the C level to avoid the global interpreter lock. We may also move to asynchronous execution of environments and agents, so that rather than everything waiting around for the slowest agent, an agent that is too slow simply misses its turn, just like in real life: if you're too slow, stuff happens. We'll also be doing a lot of performance improvement, both on existing networks and on other large language models and agents, to make them fast enough to integrate. So there's a lot of really cool work, and if any of this sounds interesting, or if doing machine learning work and research in general sounds interesting, we're definitely hiring, so feel free to reach out to me. Thanks for listening.
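The "too slow, you miss your turn" idea from the future-work list can be sketched like this. Latencies are injected as plain numbers to keep the sketch deterministic; a real implementation would measure wall-clock time or poll futures, and this is not Avalon's actual rollout code:

```python
# Asynchronous-stepping sketch: instead of the simulator blocking on every
# agent, each agent gets a latency budget per tick; if it blows the budget,
# a default no-op action is substituted and the world moves on without it.
NOOP = 0
BUDGET_MS = 10.0

def step_all(agents, observations):
    """agents: callables obs -> (action, reported_latency_ms)."""
    actions = []
    for agent, obs in zip(agents, observations):
        action, latency_ms = agent(obs)
        # Too slow? The agent misses its turn this tick.
        actions.append(action if latency_ms <= BUDGET_MS else NOOP)
    return actions

fast_agent = lambda obs: (obs + 1, 2.0)    # responds in 2 ms: action kept
slow_agent = lambda obs: (obs + 1, 50.0)   # blows the 10 ms budget: no-op

actions = step_all([fast_agent, slow_agent], [7, 7])
```

The design choice is that the simulator's tick rate is fixed and agents adapt to it, rather than the simulator degrading to the speed of its slowest agent.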
Do clients use this as an API to build their own environments?

It's free and open source, so people are welcome to use it for whatever they want. We're hoping that academic researchers will use it, primarily. It can run on a single GPU pretty easily, so it's accessible for most academic labs. If people want to use it for business they're welcome to, although it's currently GPL, so we do require that you contribute back any fixes or changes you make.

Another question here: can it be used to generate a digital twin of small-scale agricultural farms, with 3D simulated plants, for various types of ag work?

You could try doing that, but we have very purposely stayed away from making it ultra-realistic: the ten physics ticks per second, for example, and the visuals you saw before are purposely not very realistic, because we're really going for speed. It's really meant as a scientific tool for asking questions about how we can make agents learn, and a bit less for making agents that we could then transfer to the real world. We might do that in future work, extending things to be more realistic, but that's not our focus right now.

You had started to answer this, but what are some of the most common use cases?

Right now it's primarily intended as a research tool, as I was saying. Some researchers are working on extending it in various ways, and we are extending it to add a bunch more tasks: not just the simple physical tasks like running, jumping, and throwing that are in there now, but more linguistic tasks, or sort of unbounded tasks. One of the things we really want to work towards is effectively a benchmark for general intelligence, which would have thousands or tens of thousands of tests. Other people might use it in the future for more multi-agent work, or to look at how you can reuse computation from previous RL agents. And one of the things this opens up is the ability to train agents for much longer than agents have been able to train before.
So you can ask: how do you learn things within a lifetime? It opens up a bunch of new possible research questions.

My question is: do you think training agents to act in simulated virtual environments is going to be the most important or useful application of RL in the next few years? Or will it be more things like reinforcement learning from human feedback for aligning large language models, or code generation models, things like that? Just curious how you think about the evolution of RL over the next few years.

Well, RL from human feedback is an application of reinforcement learning to other things, and the purpose of this tool is to let us make agents that are better at reinforcement learning in the first place. So anything we discover using this tool applies to those types of applications; it's one meta-level removed. If we can make agents that learn much better from less data, or learn much more quickly, we can apply that to tons of different possible applications.

Cool, makes sense, thanks.

Okay, so we'll have a couple of pitches now from other companies. First up we have Jay.

Hi everyone, I'm Jay, a co-founder at Eventual. We are the data warehouse for complex data: data like images, audio, video, and documents, things that don't really fit in a SQL table. Our product is open source and it's called Daft. It's often said that data is the most important part of machine learning, and Daft is the data engine that's a core part of that infrastructure. Daft is a distributed Python dataframe library, so if you've ever used pandas or PySpark before, you'll be right at home with it. It's built for Python, so you can use all of your Python functions, classes, and objects, making it super easy to work with any custom complex data types you have.
That includes images, or gnarly DICOM file formats from healthcare. It's distributed: we built it to run on Ray, a distributed computing framework, so we can process petabytes of data on hundreds of machines. And we're growing really quickly: we did a soft launch with no real marketing, and in about two months we have about 400 stars. We're doing a bigger beta launch at the end of the year, and we're working on really core technical challenges: query optimization, distributed systems, data engineering. Our front end is in Python and the backend is now in Rust. You can learn more about Daft at getdaft.io, and come talk to me. Thanks.

Hey everybody, I'm Stan, one of the co-founders at Hyperglue. At Hyperglue we're building a platform that extracts insights from text you find in apps like Slack, Zoom, or Zendesk, or customer-facing forums like Reddit. Our customers leverage Hyperglue to generate real-time analytics on all sorts of things: what are gamers saying about my latest DLC on Reddit; what are the common questions or competitors coming up in my sales calls; what are the top issues in my support tickets this week. The best way to think of it is as a center of excellence, if you will, for unstructured data. We provide the "why" for why your numbers are moving: active users are up or down, or churn is up or down; why is that? Well, the hints are probably in your customer touchpoints, so we want to make it really easy for you to have that visibility across the org.

As for the founding team, we're guys who have nothing better to do than work with ML, apparently. Before this, we used these language tools primarily in national security, doing things like media monitoring or tracking terrorists around the world. It's a really cool technical problem, and we're happy to connect with anybody interested in the NLP space, or in how it translates to real-world commercial applications, and happy to answer any questions.
Any specific types of engineers that you're looking to hire?

Yeah, so we are hiring on the platform side, the ML side, and the UI side; we're just about to start growing, hopefully. So, like I said, happy to connect with anybody who feels like the problem space is something they'd be interested in.

Okay, my name is Ben Colman, I'm the co-founder and CEO of Reality Defender. We do real-time deepfake detection for platforms: banks, streaming, social media, adult entertainment. To dive into what that actually means: the arms race is expanding incredibly fast, and it's imbalanced. There are over a hundred thousand deepfake models, and only about three percent focus on detecting deepfakes; all the rest are the cool generative AI you're seeing every day. What we do is provide an ensemble approach to deepfake detection, integrating multiple models together and looking for examples of known models, or unknown models with known signatures. We create our own deepfakes in the lab. We have a no-code web app, we have an API, and we also have a passive internet-scale scanner, with all kinds of dashboards and exports and report cards and email alerts. The possibilities are endless, because deepfakes attack every single industry vertical: everything you do, everything you touch, everything that looks like you, whether it's a bank, an insurer, a streaming platform, social media, or a government group. Going very fast here, the use cases are very obvious: fake faces, fake accounts on social media, shopping, banks, fake housing, fake interiors. With two minutes of Paige's voice I could create a perfect deepfake of Paige; we won't do it on this call. Any of us can be deepfaked online: fake people at real companies, real videos with fake voices, or vice versa. We have an API, and the platform is super simple: log in with a password, drag and drop a file, and get immediate results across multiple models relevant to the type of media and the codec.
It handles different types of compression, with all kinds of exports. I'll leave it at that: it's a fun, scary company working on a scary problem, and a very amazing company. We're recruiting across a number of areas: research, engineering, data science, and strategy and operations.