SDS PODCAST

EPISODE 333:

BERT AND NLP IN

2020 AND BEYOND

http://www.superdatascience.com/333

Kirill Eremenko: This is episode number 333 with Director of Data

Science, Sinan Ozdemir.

Kirill Eremenko: Welcome to the SuperDataScience Podcast. My name

is Kirill Eremenko, Data Science Coach and Lifestyle

Entrepreneur. Each week, we bring you inspiring

people and ideas to help you build your successful

career in data science. Thanks for being here today

and now, let's make the complex simple.

Kirill Eremenko: Hey, everybody. This episode is brought to you by Data

Science Insider, our very own newsletter, which comes

in your inbox every Friday. So, I'll make this one brief.

Basically, you go to superdatascience.com/dsi and

sign up for an absolutely free amazing newsletter,

which is curated by our team. We look at the top

developments in artificial intelligence, machine

learning, data science, and other exponential

technologies which are relevant to us as data

scientists. So, we look at the top five developments in

the past week. We put them together, put some

images, put a short description for each one, send

them out in an email, put the link, which takes you

straight to the source if you want to read

further. That way, you can stay up to date with what

exactly is happening in the world of data science and

artificial intelligence.

Kirill Eremenko: So, once again, the link is superdatascience one word

dot com slash DSI, which stands for Data Science

Insider. So, head on over there and sign up today, and

start receiving your updates on technology that is

relevant to your career already this week.


Kirill Eremenko: By the way, in this podcast with Sinan, you will hear

at the end how he is actually benefiting a lot from

newsletter updates, and we talk about the ones that

are our favorites. So, this one can be your favorite as

well. On that note, let's get straight into the episode.

Kirill Eremenko: Welcome back to the SuperDataScience Podcast, ladies

and gentlemen. Super, super pumped and excited to

have you on today's episode. We got off the phone with

Sinan a couple of hours ago, and what a blast. I am so

excited to bring you today's episode, and I'm really

happy that it's got the magic number of 333 because it

totally deserves that number, totally deserves to be

special and unique. The conversation was extremely

insightful.

Kirill Eremenko: So, straight to the point, here's what you will hear on

today's episode. You will hear how Sinan's company,

Kylie.ai, was acquired. If you've been with the podcast

for long enough, you'll remember that Sinan has

appeared already twice on the podcast. Most recently,

he told us about his new startup, Kylie.ai, and the

magic they were doing in the space of natural language

processing. Recently in 2019, they were actually

acquired by Directly and with that acquisition, Sinan

has now become the Director of Data Science at

Directly, and he's leading a team there. It's so exciting

what's been happening in his world. You're going to

love it.

Kirill Eremenko: Then we'll talk about explainable AI. We'll talk about

bias in artificial intelligence, and then Sinan will give

us actual examples, case studies of how they're

applying NLP in an AI plus human synergy for


companies such as Airbnb, Microsoft, and others.

Well, he'll actually walk us through two case studies,

one for Airbnb and one for Microsoft.

Kirill Eremenko: Then we'll talk about building company-specific AI

models, and even product-specific AI models, and what

that means for the world of AI. We'll talk more about

acquisition. We'll talk about hiring. By the way, they're

hiring at Directly. They're hiring data scientists and

data engineers. You don't have to be based in the San

Francisco Bay Area. It is preferable, but if you're not

there, you can also apply and you'll learn all the

details about the jobs. That will be somewhere in the

middle of the podcast.

Kirill Eremenko: So, make sure to listen to that if you're looking for a

job in data science or you're interested in this

company, which you will be after this podcast. It

sounds like a very exciting place to work. Then you

can hit Sinan up or apply for their jobs directly.

Kirill Eremenko: Then we talked about sharing deep learning models in

the world. We touched briefly on things like Flask,

Django, Docker containers, Kubernetes, and then we

dove into the world of natural language processing.

This whole second part of the podcast is about natural

language processing.

Kirill Eremenko: You will learn about state-of-the-art NLP frameworks

such as Google's BERT, which has been the talk of the

town for everybody in 2019. You'll also learn about

SQuAD, Word Masking bidirectional, [inaudible

00:04:58] bidirectional, why BERT is bidirectional,

what that means. You'll learn about transfer


learning, conversational design, and many, many more

topics.

Kirill Eremenko: In a nutshell, epic podcast. Can't wait for you to check

it out. You're going to love it. So, without further ado,

let's dive straight into it, and I bring to you without

further ado nobody else but the legendary Mr. Sinan

Ozdemir.

Kirill Eremenko: Welcome back to SuperDataScience Podcast, ladies

and gentlemen. I'm super excited as you can probably

hear from my voice to welcome for the third time

around the one and only, Sinan Ozdemir. Sinan,

welcome, my friend. How are you doing?

Sinan Ozdemir: I'm doing great. Thanks so much for having me. I'm

always happy to be here.

Kirill Eremenko: It is so cool. I don't know. I just have this amazing

feeling every time I talk to you. You just have some

great energy about you.

Sinan Ozdemir: Well, thank you.

Kirill Eremenko: Yeah. Why? Why do you think that is?

Sinan Ozdemir: Well, that's actually a really interesting point that you

bring up because as listeners of the podcast may or

may not know from my other times on here, I actually

come from academia. I was a lecturer at Johns

Hopkins, where I was teaching computer science and

machine learning. I always knew that my favorite part

about working at Johns Hopkins was actually the

teaching, and more specifically, teaching people who

had no idea what I was talking about, and then

making sure that by the time they were done with the


class, they actually understood and can hold a

conversation about it.

Sinan Ozdemir: I think with data science, especially because it's such

a new field, there's so few people who are majoring in

it, getting degrees in it, it's really satisfying to talk

about data science because it's a topic that a lot of

people don't understand, and it's a topic that a lot of

people don't really know what the right questions to

ask are.

Sinan Ozdemir: So, every time someone like you or someone else, they

ask me about, "What is data science and how does it

all work?" I get really excited because I get to explain

something that they don't really understand and,

hopefully, they'll walk away understanding it.

Kirill Eremenko: Fantastic. Oh, that's a great way to put it. Now, with

the recent developments, you can explain to people

business stuff as well. You just had your company

acquired. Congratulations on that.

Sinan Ozdemir: Thank you. Thank you very much. Yeah, yeah. My

company Kylie.ai was recently acquired by Directly,

and we're very, very excited about it.

Kirill Eremenko: It feels like yesterday, even though I realize it

probably was one or two years ago when you were on

the podcast the second time, when you were very

passionate on telling us about Kylie.ai and what you're

building there. It's still cool to see the success and

acquisition is a great thing. It means a big company is

recognizing what you're doing, and you now have all

this leverage through the other company to really

impact even more people. Tell us a bit about the


process. How did that all happen? You were building

your business. Did you plan on going through an

acquisition? How did they get in touch? What

happened?

Sinan Ozdemir: Yeah. So, for us, specifically, and this is a mentality

that I hold, as well as my co-founder of Kylie, we never

really went into this hoping for an acquisition. The

plan wasn't to get acquired or something like that. The

plan was to build a solid business. The idea was that if

you build a solid business, and that means going

through understanding your market, understanding

your profit, your expenses, your revenue,

understanding all of that, building a solid business

eventually will lead you down the path that you want.

Sinan Ozdemir: So, our mentality was always, "Let's pretend or let's

work towards that we're going to IPO. What would it

take to become a public company? Let's make all the

right choices along the way. If something happens, if

we catch their attention and they look at us, and they

are interested in acquisition, let's have that

conversation, but let's not make every decision as if

the whole point is to be acquired," because I think

that's when you start making decisions, where you

start favoring things like growth of user base over

growth of revenue, steady growth of revenue. So, we

always make every business decision as if we were

going to be doing this for 20, 30, 40 years.

Kirill Eremenko: Wonderful. Wow! So, how big did you manage to grow

the business? How big was the team? How many

clients did you have? In general, what triggered this

interest from Directly?


Sinan Ozdemir: Yeah. So, the interest was triggered by ... So, at the

time we were hovering around 15-20 people, and we

were servicing some very large clients, like some telcos, and

some retail brands. What really caught Directly's eye

was the fact that what we were offering was full

conversational automation with robotic process

automation or RPA.

Kirill Eremenko: Wow!

Sinan Ozdemir: Yeah. So, Directly is actually in the business of

customer success automation as well with human in

the loop AI. With the acquisition of Kylie, what that

really brought to their business offering was full end-

to-end conversation automation with that automation

of backend processes as well. So, the idea was always

with Kylie. Customer support isn't just a conversation.

It's also about the actions that take place throughout

that conversation.

Kirill Eremenko: Yeah. Sorry. It's like someone putting in the details of

the customer, making some notes along the way,

putting some flags here and there, adding them to a

segmentation, writing up some notes after the call, all

that stuff, right?

Sinan Ozdemir: That, too, but also even during the conversation

looking up information about the user to guide the

conversation. A good example that I like to give is let's

say you're calling in or chatting in to your internet

provider, and you have a question about your past bill.

Maybe your question is, "Why is the bill so high?" or

"Why am I getting charged for this thing?"


Sinan Ozdemir: In the moment, the agent has to look up, first of all,

who are you. Who is this person chatting in? What is

their account status? To answer your question, I have

to be able to see your bill in front of me. So, even

during that conversation, the agent or bot on the other

side has to be able to look up that information in real time

and then use that information to answer the question.

Sinan Ozdemir: That's actually that next generation of conversational

AI that Kylie was offering to its clients. So, it's not just

whenever the poster or the user speaks, how does a

chat bot respond. What we were really about was,

"Well, how do we respond with the context of all the

information that the client has to offer?"

Kirill Eremenko: Very interesting. Putting the conversation into context

on the fly pretty much, right? As soon as they call, you

already have the information up there.

Sinan Ozdemir: Exactly because if someone says, "I have a problem

with my account," and that could be for a number of

reasons. Maybe they're not currently a customer.

Maybe their account was locked because their credit

card was stolen. Maybe their account was shut down

because someone shut it down a month ago. Having

that context really disambiguates a person's question,

and it's actually something that the user may not even

know themselves. All they know is, "I can't log in," but

they don't know why. The agent or bot on the other

side can actually find out why and use that

information in the conversation.
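
[Editor's note: the idea of using account context to disambiguate a vague message such as "I can't log in" can be sketched roughly as below. This is only an illustration and not Kylie.ai's or Directly's actual system. The account fields, the in-memory lookup table, and the reply texts are all hypothetical.]

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    # Hypothetical account record; field names are assumptions for illustration.
    user_id: str
    is_customer: bool
    locked_reason: Optional[str] = None  # e.g. "card_stolen"
    closed: bool = False


# Hypothetical in-memory stand-in for the backend lookup step.
ACCOUNTS = {
    "u1": Account("u1", is_customer=True, locked_reason="card_stolen"),
    "u2": Account("u2", is_customer=True, closed=True),
    "u3": Account("u3", is_customer=False),
    "u4": Account("u4", is_customer=True),
}


def respond_to_login_problem(user_id: str) -> str:
    """Choose a reply to "I can't log in" based on what the account lookup shows."""
    account = ACCOUNTS.get(user_id)
    if account is None or not account.is_customer:
        return "We could not find an active account for you."
    if account.closed:
        return "Your account was closed. Would you like to reopen it?"
    if account.locked_reason == "card_stolen":
        return "Your account is locked because your card was reported stolen."
    # No account-level cause found, so fall back to the generic fix.
    return "Let's reset your password."


if __name__ == "__main__":
    for uid in ("u1", "u2", "u3", "u4"):
        print(uid, "->", respond_to_login_problem(uid))
```

[The same user message leads to a different reply for each account state, which is the disambiguation Sinan describes.]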

Kirill Eremenko: That's really powerful. I can totally see how that adds

business value to all your customers. Can you tell us a


bit more, especially for our listeners who haven't heard

the previous podcast, by the way, if you haven't, we'll

link to that in the show notes, what role does data

science play in this product? Because we can really

gauge how data science works in the product, but if

you could describe in a bit more detail, please.

Sinan Ozdemir: Of course. So, data science, obviously, is a very big

term, and there's a lot of subsets of data science that

go into a product as advanced as conversational AI

with RPA. That really ranges all the way from the more

analytic side, where understanding the client's

conversations and just knowing things as simple as

volume and how volume shifts throughout the day,

and what kinds of questions are coming in all the way

to using deep learning and transfer learning to really

understand the natural language coming in and to

generate a response back.

Sinan Ozdemir: So, we really run the spectrum between analytics all

the way to deep learning and transfer learning to make

sure that we are delivering state-of-the-art natural

language processing, generation, and understanding,

and making sure that we have the insights for our

clients to understand what's happening because a lot

of the times when companies are deploying these really

big behemoth deep learning models, they don't often

come with this insights platform as, "Well, how do I

convey what our deep learning model is doing to our

clients? How do we build that trust?"

Sinan Ozdemir: Because these days when someone says, "We're using

AI. We're using deep learning. We're using such and

such," sometimes there are people who will look at


that and say, "Well, hold on a minute. What exactly

are you doing with this data? What exactly are you

doing with this model?"

Sinan Ozdemir: So, we're really trying to build that trust with not only

our clients, but our clients' customers, making sure

that everyone understands how the AI is working and

what controls we do have over the AI systems.

Kirill Eremenko: That's a very interesting question because it has been

in 2019 and I think even more so will be in 2020 and

beyond a central topic in AI: explainable AI, and the

implications whether ethical or operational of having

non-explainable AI, and it's really cool always to hear

when a company manages to get one step closer to

explainable AI. Do you mind sharing, of course, if

you're able to share how do you explain or how do you

facilitate that explainability of your deep learning

models?

Sinan Ozdemir: Of course. Yeah, it's really not a secret. I don't think

there's one single way to turn your deep learning

models into an interpretable system. By the way,

interpretability is one of my core tenets of data science

because it's really important to know how other people

are using data around you.

Sinan Ozdemir: So, as far as deep learning goes, I think there is this

gray area where data scientists can simply say, "Oh, it's

just a bunch of matrices being multiplied together.

There's really no way to know what's happening." I

think that's an excuse that data scientists can use to

say, "Well, I built this great model. I don't really have


to explain how it works," because that's too difficult. I

think that's not really the right way to approach it.

Sinan Ozdemir: I think even for deep learning, you have an

opportunity to say, "Well, here are the inputs. Here are

the outputs. Here's the training data. Here's where we

got the training data. Here's what we did to the

training data to make it more readable to the machine

learning model." There's so many steps in between not

having deep learning, and having deep learning that

you can explain along the way.

Sinan Ozdemir: Something as simple as where does the data come

from can answer sometimes a majority of our clients'

questions because even that is a mystery to some

people is, "Well, where does all of this data come from?

How do you learn all of this? Where is that

information and insight derived from?"

Sinan Ozdemir: So, it's not always just about how does the model

work, but sometimes it's as simple as, "Well, where

does the data even come from, and then what do you

do with it, and then how do you feed it back in to the

system to update later on?"

Kirill Eremenko: Yeah, and does the data have bias, which to your

point, what source it comes from. Maybe you're

originally getting the data from a source that has bias

inherent in it, and then, therefore, you're training your

models based on biased data.

Sinan Ozdemir: Yeah. Bias was actually a big part of the talk I gave at

SuperDataScience last year is, how do you know

where the data is coming from? How do you know that


the source is valid? How do you confront those biases

and resolve those biases?

Sinan Ozdemir: A big thing that Directly is actually doing is curating a

network of subject matter experts around the globe to

help understand and resolve those biases in our

clients' data. So, we are really working hard with

humans and AI together to resolve a lot of those biases

that are fed into the models.

Kirill Eremenko: Very interesting. So, I love that. What are these subject

matter experts? Again, if you can share just probably

some sense of information, but to the extent you can

share, what do these subject matter experts do, and

what kind of data are they looking at?

Sinan Ozdemir: Of course. Again, this is not a secret. This is actually

one of the core differentiators that Directly offers as

our expert intelligence platform. What that really

means is we are working with people around the world

who actually have knowledge and deep

subject matter expertise in our clients' offering.

Sinan Ozdemir: So, a really simple example is one of our bigger clients

is Airbnb. What we actually do is we work with Airbnb

super hosts around the world, people who use Airbnb

daily just on their own, and then we go to these people

and say, "Hey, listen. Here is some data from Airbnb

like an intent matching problem."

Sinan Ozdemir: For example, if a user of Airbnb chats in and says

something like, "I need to rebook because my host

canceled on me," or something like that, we work with

hosts who say, "Well, I've been there. I understand this

problem." So, we're really making sure that our intent


matching, our data labeling, our conversational flows

are being audited and looked at by real people who

understand our clients' offerings, not just Directly

employees, but people who really understand how

Airbnb works.
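
[Editor's note: a toy sketch of intent matching with a human review step is shown below. It is not Directly's method. It swaps in a simple word-overlap (Jaccard) score where a production system would use a trained model, and the intent names, example phrases, and threshold are hypothetical.]

```python
import re

# Hypothetical intents, each with one example phrase.
INTENT_EXAMPLES = {
    "rebook_after_host_cancel": "I need to rebook because my host canceled on me",
    "refund_request": "I want a refund for my reservation",
}


def tokens(text: str) -> set:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def match_intent(message: str, threshold: float = 0.3):
    """Return (best_intent, score, needs_expert_review) for a user message."""
    msg = tokens(message)
    best_intent, best_score = None, 0.0
    for intent, example in INTENT_EXAMPLES.items():
        ex = tokens(example)
        union = msg | ex
        # Jaccard similarity: shared words divided by all distinct words.
        score = len(msg & ex) / len(union) if union else 0.0
        if score > best_score:
            best_intent, best_score = intent, score
    # Low-confidence matches are flagged for a subject matter expert to check.
    return best_intent, best_score, best_score < threshold


if __name__ == "__main__":
    print(match_intent("My host canceled, I need to rebook"))
    print(match_intent("hello there"))
```

[Messages that score below the threshold are flagged for an expert rather than answered automatically, which mirrors the human-in-the-loop auditing described above.]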

Kirill Eremenko: Wow! I am listening with all ears. This is so

interesting. You do not only get to work on AI, but you

also work with real people to tailor that. That's

probably living the definition of a synergy between

artificial intelligence and human intelligence. That's so

cool.

Sinan Ozdemir: Absolutely. Yeah. Really, it's one of those things where

I've spent a majority of my data science professional

career teaching about how do we as data scientists

find data, curate data, work with data, process data,

model data that sometimes what gets lost in that mix

is, well, the data comes from somewhere, and usually

that somewhere is humans.

Sinan Ozdemir: So, at Directly, it's really important to us to create that

synergy between humans and AI because if you have

AI without the humans, you start to see that

degradation. Interpretability becomes difficult. It starts

to become unruly. So, working with our expert

network is really what differentiates not just our

business offerings, but our AI as well.

Kirill Eremenko: Fantastic. That was a very clear example with Airbnb

that you gave. Are you able to share another example,

maybe from a different industry?

Sinan Ozdemir: Yeah, of course. What's really easy about that is this

really works across several industries and domains.


So, Airbnb is one, but another one that is pretty fresh

in my mind from the work that I have been doing is

Microsoft. Now, Microsoft, being another client of

Directly-

Kirill Eremenko: That is so cool. Congrats. Such big companies. That is

so exciting.

Sinan Ozdemir: Yeah. Of course. Thank you so much. With Microsoft,

it's actually in some ways a better example because

Microsoft has so many offerings, right? They have

LinkedIn. They have OneDrive. They have all of these

different unique product offerings that all require a

different touch to their customer support, and they all

require a different touch from their AI as well.

Sinan Ozdemir: At Directly, just like at Kylie, we really focus on

specialized company-specific AI models. So, each one

of our clients and each one of their product offerings

can have a very granular level of AI, and a model that's

curated especially for them. So, with Microsoft, for

example, their OneDrive customer support and their

LinkedIn customer support models can be very

different because someone saying, "I can't log in to

my LinkedIn account," versus "I can't log in to my

OneDrive account," may have very different answers

depending on the type of product that they're using.

Sinan Ozdemir: For LinkedIn, it may be as simple as, "Here's this

website. Here's how you figure out how to log in." For

OneDrive, it could be more complicated. It could say,

"Well, you're going to have to come back in at this

time, and do this, and do that." So, the answer may

change even though it's all under the big umbrella


company Microsoft. So, it's really important for us to

understand not just at a company level, but at a

product offering level how the AI is going to be different

between them.
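
[Editor's note: product-specific models can be pictured as a lookup from product to its own model, as in the sketch below. The "models" here are placeholder functions with placeholder replies, and the registry and function names are hypothetical. The episode does not describe how Directly actually routes messages.]

```python
from typing import Callable, Dict


# Placeholder "models": plain functions standing in for trained, product-specific models.
def linkedin_model(message: str) -> str:
    return "LinkedIn-specific answer (placeholder) to: " + message


def onedrive_model(message: str) -> str:
    return "OneDrive-specific answer (placeholder) to: " + message


# Hypothetical registry mapping a product name to its own model.
MODELS: Dict[str, Callable[[str], str]] = {
    "linkedin": linkedin_model,
    "onedrive": onedrive_model,
}


def answer(product: str, message: str) -> str:
    """Route the same user message to the model curated for that product."""
    model = MODELS.get(product.lower())
    if model is None:
        raise KeyError(f"No model registered for product: {product}")
    return model(message)


if __name__ == "__main__":
    print(answer("LinkedIn", "I can't log in"))
    print(answer("OneDrive", "I can't log in"))
```

[The identical message "I can't log in" gets a different answer depending on the product it is routed to.]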

Kirill Eremenko: That is crazy. I was just thinking that you're moving

not just from company-specific AI models to product-

specific AI models, you must have a billion people

working in your data science team. Where do you get

the time to build all these models?

Sinan Ozdemir: Well, it's really a factor of understanding that there

isn't going to be some AI model that will work for every

situation, every time of day, for every language. It's

really about understanding what are the best types of

models for different situations. So, you don't need a

billion people to make this work. You need a few really,

really smart individuals, us, like the people on my

team, really, really smart individuals who understand

it's not just about, "How do we build this gigantic deep

learning network that will understand anything at

anytime, anywhere?" It's really about, "Well, how do we

understand our clients' specific needs, and then how

do we deliver AI that is right for our clients?"

Sinan Ozdemir: So, our Airbnb models, and our Microsoft models, and

our Samsung models all might be very different from

one another because they're all trying to answer

different questions.

Kirill Eremenko: Is there a company that you don't work with?

Sinan Ozdemir: There's probably a few out there. We'll get them,

though.


Kirill Eremenko: That's awesome. Well, fantastic. Congrats on that.

Sounds like a very exciting space to be in. Help me

understand, though. So, you built a business where

you are the founder, you're co-founder. There's two of

you. You had 16 people in the team. Then along came

Directly. You agreed to the acquisition, and you stayed

with the business. So, obviously, I guess there's

usually a choice whether you leave, you just sell the

business and you leave or you stay with the business.

Why did you stay with the business and what is your

new role in this company, in Directly?

Sinan Ozdemir: So, you're right. There is usually a choice. There's

usually you say, "Well, I'm done. I'm going to walk

away," or you can stay on with the acquiring company.

The reason I stayed on with the acquiring company,

the reason I stayed with Directly was because their

product offering, their roadmap, their vision for using

AI in customer experiences aligns so much with Kylie

that the acquisition really felt more like a merger,

right?

Sinan Ozdemir: We had product offerings that they were hoping to

build in 2020, and they had this network of experts

that we knew were going to be so beneficial to our AI

models. It just felt like a perfect match.

Sinan Ozdemir: So, really, for me and my co-founder, it really didn't

feel like we had to choose between working with them

or leaving. We really wanted to work with Directly. It

just felt so natural and like a perfect match for our

teams, and for our AI, and for our product offerings.

So, it was really, really nice to have that matchup

together.


Kirill Eremenko: Nice. Did you get to meet the executive team for the

acquisition, like feel for what kind of people they are?

Sinan Ozdemir: Yeah. For anyone out there who is unfamiliar with

acquisitions in the startup space or maybe they're

thinking about doing one of their own, I highly

recommend not just meeting the executive team before

an acquisition, but really get a feel for what it's like to

work there.

Sinan Ozdemir: One of my favorite things that Directly did while they

were looking at us was every quarter, Directly has a

hackathon, where they invite everyone at the

company, not just engineers, to work on a project or

piece of code or something that they would not get to

work on normally. While we were going through the

acquisition, while we were still in talks, Directly had

their hackathon, but they invited us to their

hackathon.

Kirill Eremenko: Awesome.

Sinan Ozdemir: At first I said, "Well, we don't work there yet. We don't

want to be imposing."

Sinan Ozdemir: They said, "Absolutely not imposing. We want to see

what it looks like to work with you, guys."

Sinan Ozdemir: So, we actually ended up joining hackathon teams and

working with Directly employees to get a sense for

what it would feel like to work together. So, that's

actually one of my favorite things that Directly did. It

wasn't just about, "Okay. How much more revenue will

we get? Okay. How many patents are we going to

receive?"


Sinan Ozdemir: For them, it was really more about, "Well, how do we

work with these people? How does Sinan fit in to our

team?"

Sinan Ozdemir: To answer your other question, I recently came on as

their Director of Data Science. So, my role has really

shifted from how do I build this product, how do I offer

this AI model, this data science platform to the world,

and that shifts to, well, I still do that, but now, I get to

think about, "How do I bring data science to the rest of

the company? How do I democratize machine learning

and AI to a point where anyone at Directly feels

comfortable talking about what our machine learning

models do for our clients and for the market?"

Kirill Eremenko: Well, congrats, first of all, on the huge role. That's

massive at a company-

Sinan Ozdemir: Thank you.

Kirill Eremenko: ... that large and that's working with such great

customers, that's very responsible. Also, the

description of the acquisition, amazing, amazing. I'm

learning so much just by talking about this. One thing

I wanted to understand, first of all, how big is your

team in Directly as the Director of Data Science?

Sinan Ozdemir: Yeah. So, the way that Directly is set up is we have our

data science resources spread out among several of

our teams, our engineering teams. So, on my team, I

have people who are data scientists, machine learning

engineers, but I'm also working extremely closely with

the analytics team. While they may not be directly a

department under data science, they are still, in my


mind, doing exactly the same things a data scientist

would do.

Sinan Ozdemir: So, my team is very broad. I have about probably five

to 10 people at the company, who are in some way

performing data science tasks, and who are actually

doing the analytics and the machine learning behind

the scenes. We're still looking to grow that team. We're

going to be hiring relatively soon for more data

engineers, and more data scientists, and more

machine learning engineers because we're always

trying to make sure that we're staying at the top when

it comes to delivering that state-of-the-art NLP.

Kirill Eremenko: That's amazing. That's really great. That's already a

decent-sized team, five to 10 people. It's exciting to

hear an example. This has come up on the podcast

before. We have companies, which choose to have a

centralized data science team. All the data scientists

sit together, and there's these companies that choose

to have an integrated data science team, where the

data scientists are spread out, but it sounds like yours

is more on the integrated side of things where you

have data science representatives within individual

product areas of the business.

Sinan Ozdemir: It is much more-

Kirill Eremenko: I wanted to ... Yeah?

Sinan Ozdemir: Sorry. It is much more in the integrated side, but at

the same time, we also find a lot of value in working

together. So, one example of that is every two weeks,

we have what's called a journal club. Those of you

who are getting master's degrees or PhDs know what a journal


club is: every two weeks we decide on an academic

paper that came out in the last year. We all read it, we

all digest it, and we all have to come with examples of

how could we use this paper at Directly.

Sinan Ozdemir: So, we're always reading what's latest and greatest,

and we're always thinking about how do we apply this

to the company. It's really my personal conviction to

say a data scientist is neither just on the business side

or just in the research side or just in the software side.

We have to be able to understand each other's

language.

Sinan Ozdemir: I want people who have PhDs in math thinking about

how could we use this to really ramp up our devops for

machine learning. So, I really want them really cross-

functionally thinking about data science, not just

sticking to "what they do best".

Kirill Eremenko: Amazing. Totally amazing. I think this is a good time to

do a recruiting plug because before the podcast, you

mentioned you will be hiring.

Sinan Ozdemir: That's right.

Kirill Eremenko: You've also mentioned it now. Tell us about it because

we've got 10,000 data scientists listening to this. Let's

get them sending you their resume.

Sinan Ozdemir: Absolutely, yeah. So, Directly is hiring, and for more

things than just data science, but we are hiring for

data science, and for me, that's super exciting because

I always love growing my team, and I love getting

different perspectives about data science. So, what

we're looking for are people who are ready to work with


natural language processing, people who are

experienced and people who are ready to get their

hands dirty with the latest and greatest in deep

learning for natural language processing, generation,

and understanding.

Sinan Ozdemir: So, anyone out there who is interested in working with

a really awesome tech startup in San Francisco and is

really excited about working on these really interesting

problems that involve not just conversational AI, but

also in robotic process automation, how do we

automate these backend processes to empower these

automated conversations, we want you. We want you

to check us out.

Kirill Eremenko: Man, you're making it sound so exciting. I want to

work for you. This is so cool, especially the work, NLP,

deep learning, RPA, the cutting edge discussions about

cutting edge technology integrating to the business.

The culture sounds fantastic. Do you guys hire people

remotely or do you have to be based in San Francisco?

Sinan Ozdemir: Thanks for asking that. We don't have a specific policy

on whether you have to be remote or in San Francisco.

We do have remote workers. We do tend to prefer that

people are in the Bay Area, and that's usually just so

that we can start, to your point, to have that culture,

especially at such a young or at a young stage of a

startup. It's really important for us to make sure that

our culture is being built as well as we possibly can,

but we do offer remote positions.


Sinan Ozdemir: So, if you're thinking, "I don't live in San Francisco. I

can't even apply," don't think that way. Please, please

do apply even if you don't live in the Bay Area.

Kirill Eremenko: Fantastic. Indeed. If you're really good at NLP, you're

really passionate about NLP or there's somebody really

passionate about NLP that comes to you and says,

"Hey, I live in Budapest," or somewhere else, why

wouldn't you hire them, right?

Sinan Ozdemir: Yeah, absolutely.

Kirill Eremenko: You're getting a super talented person from the other

side of the world.

Sinan Ozdemir: Absolutely.

Kirill Eremenko: Fantastic. Well, I hope everybody is excited. How do

they get in touch, Sinan, just so we put that on the record?

Sinan Ozdemir: Of course. So, on the website, directly.com, we do have

a page full of all of our positions that we're hiring for.

Again, like I said, it's not just data science. We are also

always looking for full stack backend, front end

engineers. Even if you have a data science mind, that's

even better, right? I love working with full stack

engineers who have built functioning websites around

machine learning.

Sinan Ozdemir: I think in today's age, that's really important. As more

and more people are switching to containerized

applications and working with things like Amazon's

Elastic Container Service or Kubernetes, it's really

important for a data scientist not just to understand

the models, but understand, "How do I take these


models that I am building and deliver them to the

world with high availability and low latency?"

Sinan Ozdemir: So, it's not just machine learning engineers, and

statisticians that we're looking for. I'm really looking

for people who have data skills, and also run the

models themselves.

Kirill Eremenko: Okay. So, people who have data skills, can run the

models themselves, very, very exciting times. Yeah.

What was I going to say? Natural language processing

and RPA combined, what a really cool... This is what I

was going to say that people listening, I love this stuff,

people listening to this podcast, just if you're going to

apply, just put in the application that "I heard Sinan

on the SuperDataScience Podcast", and right away,

they know the Director of Data Science. How much

ahead are they compared to other people who apply

who haven't heard you?

Sinan Ozdemir: Right.

Kirill Eremenko: Crazy. Hmm. Another way to get the job is come to

DataScienceGO 2020. Sinan is going to be presenting

there, doing a workshop for advanced practitioners,

and maybe more. So, just meet Sinan at

DataScienceGO, and give him your resume in person.

Sinan Ozdemir: That's right. That's right. No, I love that. People always

say, "I hate being handed resumes. I hate it when

people come up to me when I'm doing something else."

I love it. I mean, I used to be a teacher. I had open

office hours. I love it when people come up to me and

say, "Hey, I heard you on the podcast. I just wanted to


give you my resume," or "I just want to send you my

business card."

Sinan Ozdemir: To me, as someone who is both an entrepreneur, a

data scientist, and a teacher, I love meeting people,

and I love when people show that initiative. I just love

talking to people about what they love to do. So,

please, yes, come up to me and hand me a resume. I

am fine with that.

Kirill Eremenko: Fantastic. Thanks, man. Speaking of DataScienceGO,

first of all, thanks for accepting the invitation. Very

excited.

Sinan Ozdemir: Of course.

Kirill Eremenko: What is it also, and what are your workshops going to

be about, or your talk? Any ideas?

Sinan Ozdemir: So, you know what? I have so many ideas, and I'm not

just saying that because I have no ideas, but I really

do. I'll workshop a few of them right now with you.

Maybe you can give me some of your feedback.

Kirill Eremenko: Sounds good.

Sinan Ozdemir: One thing that I really want to start talking about

more is, like I said before, how do you take those

models that you've built? These great models they

have great metrics, they're performing well, and they

work in your Jupyter notebook, maybe, but how do I

deliver that model to the world? How do I put it in a

place where people can use it whenever they need to

use it? How do I build APIs around my models? How

do I build websites around my models?


Sinan Ozdemir: I think that might be something that I'll explore in

teaching people not just how to build those machine

learning models, but how do you actually integrate

them into your systems. How do you actually deliver

them to the world?

Kirill Eremenko: Very cool. I like that.

Sinan Ozdemir: Yeah. I like that. That was really good.

Kirill Eremenko: I like that idea.

Sinan Ozdemir: Yeah.

Kirill Eremenko: Yeah. So, what tools would be involved in that?

Sinan Ozdemir: So, that really depends on the way we want to take

it, but we'd probably want to learn about some web

frameworks, maybe like a Flask or Django. We'd also

learn about Docker, and Docker containers, and

containers in general. We'd also want to talk about

how to deploy those containers using something like

Kubernetes, and how to get to the cloud.

Sinan Ozdemir: There's this whole pipeline of how do you serve up

machine learning models. To get one step further, once

it's in an API, how do you build a website on top of that?

How do you build a Chrome extension that can

actually call that API, so people can use it in realtime?
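
As a rough illustration of "building an API around a model", here is a minimal sketch that uses only Python's standard library. The `predict` function is a hypothetical stand-in for a trained model, and the port and JSON field names are assumptions made for the example; a real service would more likely use one of the frameworks mentioned above, such as Flask or Django, and run inside a container.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(text):
    """Hypothetical stand-in for a trained model: a trivial keyword rule."""
    positive = {"love", "great", "good"}
    score = sum(tok in positive for tok in text.lower().split())
    return {"label": "positive" if score > 0 else "neutral", "score": score}


def handle_payload(raw_body):
    """Turn a JSON request body into (status_code, JSON response body)."""
    try:
        text = json.loads(raw_body)["text"]
    except (ValueError, KeyError, TypeError):
        text = None
    if not isinstance(text, str):
        error = {"error": 'expected a JSON object with a string "text" field'}
        return 400, json.dumps(error)
    return 200, json.dumps(predict(text))


class PredictHandler(BaseHTTPRequestHandler):
    """Answers POST requests by running the model on the posted text."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, body = handle_payload(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body.encode("utf-8"))


def run_server(port=8000):
    """Start the service; call this explicitly, it blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Keeping the request handling in a plain function (`handle_payload`) separate from the web layer means the same logic can be tested in a notebook and later moved behind whichever framework or container setup the team prefers.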

Sinan Ozdemir: So, I think that's one of the routes I'm thinking of

going. The other route was really just addressing this

renaissance in natural language processing focusing

on things like BERT, transformer architecture, GPT-2

and really diving in not just into the inner workings of


these deep learning models, but really, what is their

best use case in the real world?

Sinan Ozdemir: So, I think that's something that sometimes can get

glossed over is, "Wow! There's these great models out

there. There's BERT, there's ELMo, there's all of these

different models coming out, but what do I do with

them? How can I even begin to use such complicated

models when I don't have a PhD in whatever?" I think

that would be something that I would want to really

dive into is, how do you actually use the latest and

greatest in NLP for what sometimes may be considered

even the simplest of problems.

Sinan Ozdemir: So, I think I'm between the devops, the dockerizing,

the serving up machine learning architecture versus

really diving deep into the latest and greatest in NLP,

and using that to build the next generation of natural

language models.

Kirill Eremenko: Wow! That's such a tough choice. Both are amazing.

Actually, both sounded like huge workshops. Are you

sure you can cover all like Flask, Django, Docker,

Kubernetes in two, three hours?

Sinan Ozdemir: I think we can do it. I think for the people who really

want to learn it, I think we can do it. The reason I

think that is because it all works together so well. I

think that if we need more than three hours to go

through at the very least the high levels of how it all

works with an example and actually building one

ourselves, I think it's not being explained the right

way.


Sinan Ozdemir: I think there really is this kind of natural flow between

building out a model and serving it up in Kubernetes

that I think really can be addressed in two to three

hours.

Kirill Eremenko: Yeah. Amazing, amazing choices. It's going to be really

hard to pick, but I guess we'll find out, and we'll all see

at DataScienceGO 2020. So, now, let's switch gears a

little bit, and talk about natural language processing.

So, BERT, it's been in the air for a while now.

Everybody is talking about BERT. What is BERT?

What is it used for in a nutshell?

Sinan Ozdemir: Yeah. Of course. So, BERT is one of those latest and

greatest NLP models that I was talking about before,

and the papers that originally started talking about

BERT, which, by the way, stands for Bidirectional

Encoder Representations from Transformers, but

that's not really that important. I think the first papers

came out late 2018 about it.

Sinan Ozdemir: What it is, is a language modeling architecture. So,

what that really means is it's a way of pre-training a

deep learning network to take in text, whether it's

English or multilingual text, taking in that language,

and representing the context as a vector.

Sinan Ozdemir: So, if you think about it, and for those of you who are

working with something like a scikit-learn or TF-IDF

vectorization, what those do in a nutshell are not so

different. They're taking in text, and they're outputting

vectors that represent that text.
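
To make "text in, vectors out" concrete, here is a minimal TF-IDF sketch in plain Python. It is an illustration under simple assumptions (whitespace tokenization, unsmoothed IDF), not scikit-learn's implementation; scikit-learn's `TfidfVectorizer` applies smoothing and normalization by default, so its numbers will differ. The three example sentences are made up for the illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) for a list of non-empty text documents."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears.
    df = Counter(tok for doc in tokenized for tok in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vec = []
        for tok in vocab:
            tf = counts[tok] / len(doc)        # term frequency in this document
            idf = math.log(n_docs / df[tok])   # plain IDF, no smoothing
            vec.append(tf * idf)
        vectors.append(vec)
    return vocab, vectors


docs = ["I like this dog", "I like this cat", "the dog is loyal"]
vocab, vectors = tfidf_vectors(docs)
for doc, vec in zip(docs, vectors):
    print(doc, [round(x, 3) for x in vec])
```

Each document becomes one fixed-length vector, and a word's weight depends only on counts, not on the words around it.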

Sinan Ozdemir: BERT and other architectures like it, especially

transformer architectures, are trying to do something


very similar. They're trying to take in texts and

represent that text as a series of numbers, as a vector,

and those vectors can then be used to train or fine

tune a new model for a downstream task. When I say

downstream task, I mean like a classification problem

or a sentiment analysis or something like that.

Sinan Ozdemir: So, BERT is really a way to take in language, words,

strings, sentences, phrases, and represent those

pieces of texts as a vector because as we all know,

machine learning models don't work well with text.

They work well with vectors. So, that's that mapping

from texts to vectors. It is all important in natural

language processing.

Sinan Ozdemir: So, BERT is one of those latest and greatest that are

able to take in text and output context, output vectors

that have been used to achieve state-of-the-art results

in a bunch of natural language processing tasks like

SQuAD.

Kirill Eremenko: What is SQuAD?

Sinan Ozdemir: SQuAD? Oh, yes. Well, SQuAD is the Stanford

Question Answering Dataset. So, it's basically a

problem where you ask a model a question, and

give it a paragraph that has the answer to that

question in it, and it's up to the deep learning model to

say, "Where in this paragraph is the answer to the

question?"

Sinan Ozdemir: It's one of those datasets that are used pretty widely to

illustrate how well a language model works.

BERT was able to achieve a metric or achieve a


performance on the SQuAD dataset that outperformed

many of the former state-of-the-art models.
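
For readers who have not seen the task, here is a small sketch of what a SQuAD-style example looks like and how an answer span can be scored. The context and question are invented for illustration, and the two metrics are simplified versions of exact match and token-level F1, which are the measures usually reported for SQuAD; the official evaluation script also strips punctuation and articles.

```python
from collections import Counter

# An invented example in the spirit of SQuAD: the answer is a span of the context.
example = {
    "context": "I like this dog because it is a very loyal companion.",
    "question": "Why does the speaker like the dog?",
    "answer_text": "it is a very loyal companion",
}
# The character offset where the answer starts inside the context.
example["answer_start"] = example["context"].find(example["answer_text"])


def extract_span(context, start, length):
    """Return the substring a model would point to with (start, length)."""
    return context[start:start + length]


def exact_match(prediction, truth):
    """True if the two strings match after lowercasing and collapsing spaces."""
    def normalize(s):
        return " ".join(s.lower().split())
    return normalize(prediction) == normalize(truth)


def token_f1(prediction, truth):
    """Harmonic mean of token-level precision and recall."""
    pred, gold = prediction.lower().split(), truth.lower().split()
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


span = extract_span(example["context"], example["answer_start"],
                    len(example["answer_text"]))
print(span, exact_match(span, example["answer_text"]))
print(token_f1("a very loyal companion", example["answer_text"]))
```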

Kirill Eremenko: Hmm. Okay. Wow! That's very cool. So, help me

understand, please. Is the main difference between

BERT and other language, because the reason I'm

asking is this idea of vectorizing words has been

around for ages, probably decades, but is my

understanding correct that the main difference

between BERT and the prior existing models is that

rather than vectorizing words, it actually vectorizes

context?

Sinan Ozdemir: So, I think there's a way to think about it where

previous or more simple text vectorization problems

are really looking at the tokens, the words themselves,

and assuming that individual tokens are

representative of the context. So, I think both types of

models are trying to map context correctly.

Sinan Ozdemir: What BERT is doing or one of the things that it's doing

to achieve those state-of-the-art results is it's working

in a bidirectional format, meaning, it's basically

reading the text both left to right and right to left. That

sounds, "What does that mean?" Right? Like, "Who

cares?"

Kirill Eremenko: Yeah. Yeah. Why would you do that?

Sinan Ozdemir: Exactly, but what that really means is that the

language model is trying to understand, "Well, I both

want to know what the words mean, but I also want to

know what the words mean when they're strung

together one at a time."


Sinan Ozdemir: So, if I were to say, "I like this dog," if I read it from left

to right, as a human, we obviously can understand

that context. If we were to pick out the words and say

like and dog, those words by themselves don't really

mean much to the sentence overall.

Sinan Ozdemir: What BERT is trying to do is say, "Well, let me take in

this phrase. Let me read it left to right and right to left,

and then basically try to understand not only the words

that are being used, but the combinations of words

and the sequence of words."

Kirill Eremenko: So, it tries to understand the sequence of words. So,

reading backwards, it would be dog this like I. How

does that add value to the algorithm reading it

backwards? What additional insights does it get from

that?

Sinan Ozdemir: Of course. So, the way that BERT is trained is really

the key here. So, the way BERT is trained is what's

called Word Masking. Basically, what that means is

our training set is phrases, input text, and we

randomly take out words. The goal of the model is to

say, "Given the rest of the sentence, what word should

be here?"
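
The masking step described here can be sketched in a few lines of Python. This is a simplified illustration, not BERT's actual preprocessing: it only swaps tokens for a `[MASK]` placeholder, whereas the original BERT paper selects about 15% of tokens and sometimes substitutes a random word or leaves the token unchanged. The function name and the 15% default are choices made for this example.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Randomly hide tokens; return the masked sequence and the hidden targets.

    targets maps each masked position to the original word the model
    would be asked to predict from the rest of the sentence.
    """
    rng = rng or random.Random()
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = tok
        else:
            masked.append(tok)
    return masked, targets


tokens = "I like this dog because it is a very loyal companion".split()
masked, targets = mask_tokens(tokens, mask_prob=0.3, rng=random.Random(0))
print(" ".join(masked))
print(targets)
```

The model never sees the words stored in `targets`; training rewards it for recovering them from the surrounding, unmasked words.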

Sinan Ozdemir: So, I may have said, "I like this blank because it is a

very loyal companion." The goal is to predict the word

dog by using the words to the left and to the right. So,

by reading it in a bidirectional fashion, you

understand, well, if you read it left to right, "I like this

blank," that alone is not going to tell you the answer to

the question.


Sinan Ozdemir: If you read it from right to left, you would say,

"Because it is a loyal companion," and you say, "Oh,

okay. Loyal companion sounds like it's probably a

dog." So, understanding what's to the left and to the

right of that missing word really helps put that one

word into context.
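
A toy way to see why the right-hand context matters: score candidate words for the blank using only the words to the left, then using both sides. Everything here is an assumption for illustration (the three-sentence corpus, the candidate list, the overlap-counting score). BERT itself does not count word overlaps; its transformer layers attend to the words on both sides of the blank at once.

```python
corpus = [
    "i like this dog because it is a loyal companion",
    "i like this car because it is a fast vehicle",
    "i like this book because it is a great story",
]


def score_candidates(left, right, corpus, candidates, use_right=True):
    """Score each candidate by how much its known contexts overlap the query."""
    left_words, right_words = set(left.split()), set(right.split())
    scores = {}
    for sentence in corpus:
        tokens = sentence.split()
        for candidate in candidates:
            if candidate not in tokens:
                continue
            pos = tokens.index(candidate)
            # Words seen before the candidate that also appear left of the blank.
            score = len(left_words & set(tokens[:pos]))
            if use_right:
                # Words seen after the candidate that also appear right of the blank.
                score += len(right_words & set(tokens[pos + 1:]))
            scores[candidate] = scores.get(candidate, 0) + score
    return scores


left = "i like this"
right = "because it is a very loyal companion"
candidates = ["dog", "car", "book"]

left_only = score_candidates(left, right, corpus, candidates, use_right=False)
both_sides = score_candidates(left, right, corpus, candidates, use_right=True)
print(left_only)   # every candidate ties: the left context alone cannot decide
print(both_sides)  # "dog" scores highest once the right context is included
```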

Sinan Ozdemir: So, the bidirectional nature of the architecture helps

the model to understand anything around the words in

the sentence, and the way that the model is trained

with Word Masking really helps the model understand,

"How do I simply pick up words and move them

around without changing the meaning of the sentence

itself?" So, it's really both about a bidirectional

architecture and the Word Masking as a training method.

Kirill Eremenko: Wow! Blowing my mind here. That is amazing. What

do you do in cases like what you probably experience

at Directly where you have live conversations, where

the text, the whole sentence is not available, so you

can't really read it right to left because the sentence is

not finished. Somebody is still talking or they're still

typing it up or the conversation isn't over? Does it

work there as well?

Sinan Ozdemir: So, I think the way to really think about it is BERT as

a tool is used to model input text for any number of

downstream tasks. So, you could basically take BERT

and say, "I'm going to use a pre-trained BERT. So, a

BERT that was trained on, let's say Wikipedia or

Twitter or some large corpus. I'm going to take that

BERT and then I'm going to train a separate task,

which may be what is the next word in the stream of

thought."


Sinan Ozdemir: That's really what transfer learning is all about. Transfer

learning is about, "How do I take this model, which

has already been trained on one dataset, and then use

that model to train a task for a different dataset?" So,

BERT can be used to do natural language generation

even though BERT itself is not a natural language

generation type model.
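
The pattern of reusing a pre-trained model for a new task can be sketched as a frozen encoder plus a small trainable "head". In the sketch below, `frozen_encoder` is a hypothetical stand-in for a pre-trained model such as BERT (here it is just a fixed keyword counter), the labeled sentences are invented, and the head is a tiny logistic regression trained with plain gradient steps. Only the head's weights change during training.

```python
import math


def frozen_encoder(text):
    """Hypothetical stand-in for a pre-trained model: fixed, never updated."""
    tokens = text.lower().split()
    positive = {"love", "great", "loyal"}
    negative = {"hate", "terrible", "slow"}
    return [
        float(sum(t in positive for t in tokens)),
        float(sum(t in negative for t in tokens)),
        1.0,  # constant feature so the head can learn a bias
    ]


def predict(weights, text):
    """Probability of the positive class from the trainable head."""
    z = sum(w * x for w, x in zip(weights, frozen_encoder(text)))
    return 1.0 / (1.0 + math.exp(-z))


def train_head(examples, epochs=200, lr=0.5):
    """Fit logistic-regression weights on top of the frozen features."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for text, label in examples:
            features = frozen_encoder(text)
            error = label - predict(weights, text)
            for i, x in enumerate(features):
                weights[i] += lr * error * x
    return weights


# A small, invented downstream dataset: 1 = positive, 0 = negative.
examples = [
    ("I love this dog", 1),
    ("what a great loyal companion", 1),
    ("I hate this terrible app", 0),
    ("the site is slow", 0),
]
weights = train_head(examples)
print(round(predict(weights, "a loyal dog"), 3))
print(round(predict(weights, "terrible and slow"), 3))
```

With a real pre-trained model, the encoder's vectors carry far richer information, and it is also common to fine-tune the encoder's own weights rather than keep them frozen.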

Kirill Eremenko: Okay. Wow! That is really insightful. So, that's what

the whole ... What did you call it, though? Oh, the

transfer learning. That's what transfer learning is all

about. Very interesting. So, do you guys use this at

Directly?

Sinan Ozdemir: Yeah. So, BERT is actually one of the many types of

models that we deploy to our clients. I believe I said

earlier, every single one of our clients' product

offerings have their own unique and different machine

learning and AI model. BERT is just one of those many

options.

Sinan Ozdemir: So, we use BERT and we'd also use some other types

of multilingual transformer architecture. So, BERT is

not the only thing that we are doing at Directly, but it

is one of the state-of-the-art models that we are

offering to our clients.

Kirill Eremenko: Got you. Well, Sinan, completely amazing

conversation. Totally loving it. We're running short on

time, but I do want to ask you one more thing.

Sinan Ozdemir: Of course.

Kirill Eremenko: That is, we're heading into 2020, what is your

prediction for NLP, for the future of NLP in 2020?


Sinan Ozdemir: I think in 2020, what we're going to start seeing, I

mean, we're already seeing it, but what we're going to

see a lot more of is integration of automated

conversations not just in our Alexas, and in our

phones, but we're really going to start to interact with

these automated conversation systems at work.

Sinan Ozdemir: We'll start interacting with them in shopping malls.

We'll start interacting with them in places where we

previously didn't really think we wanted them. That's

going to be both good and bad. I think companies who

are working diligently on curating and creating these

conversational experiences with AI really have to think

about, as I always come back to, the context.

Sinan Ozdemir: So, I think we're going to start seeing AI in new places

where concerns about privacy and bias are going to

come up, and it's really up to the data scientists and

the data practitioners to alleviate consumers' troubles

and fears, and really make sure that everyone trusts

and is comfortable with the AI that they're interacting

with.

Kirill Eremenko: Wow! That is so true. I was presenting at a conference

recently for L&D managers and leaders about the

future of artificial intelligence, and just generally what

AI is. One of the top trends was natural language

processing, and then later, a few weeks later, one of

the people that I networked with there, an executive or

an L&D manager or leader, they emailed me asking me

if I knew any good conversational designers.

Kirill Eremenko: I was like, "Is that even a job?"

Sinan Ozdemir: Yeah, it is.


Kirill Eremenko: Now, yeah, I looked into it a bit more, indeed. Yeah.

For all these chat bots, for all these, as you said,

conversational experiences, now, we need design. This

really ties in to the whole point like how people are,

"Oh, AI is taking over jobs. We're losing jobs." Well,

according to the World Economic Forum Research

from 2018, for every job that AI displaces, there'll be

1.7 new jobs that will be created.

Kirill Eremenko: This is a real life example. Conversational designers,

seriously? Those words never came up in that

sequence before three years ago or before that. Now,

it's apparently going to be a big profession.

Sinan Ozdemir: Well, it's funny you mentioned conversation design

because, actually, we were just talking about it at

Directly. The idea of conversation design has been

around for a while. Humans have been helping other

humans have conversations more effectively, but to

your point, now, we're going to start to have humans

who are helping AI have more fluent conversations.

Sinan Ozdemir: So, these jobs are shifting between helping humans

versus AI, but the concept has always really been

around for conversation design. Now, it's going to

become even more relevant as companies create more

and more automated conversational experiences. We

have to make sure that they are fluent, that people are

comfortable with them, that people actually want to

talk to them. If you build an AI that no one wants to

talk to, it could be useless in some sense of the word.

Kirill Eremenko: Exactly. Some companies I call up, and it's just a

terrible experience sitting on the phone, waiting for a


human to reply, this music playing in the background

or even worse an advertisement playing. That's

ridiculous, right? How much better will everybody's life

be when we have AI doing these conversations, and

then humans who no longer have to service a

million customers and, therefore, there's a huge

waiting line?

Kirill Eremenko: Now, those same humans can train the AI and tailor

those conversations whether it's to Airbnb, to

Microsoft, to Samsung. Whatever company you're

calling about, you're going to have an amazing experience.

I think it's a win-win for everyone.

Sinan Ozdemir: Absolutely.

Kirill Eremenko: Fantastic. Well, Sinan, it's been a huge pleasure.

Thank you so much for coming to the show. One

question I do have for you before we go. What's a book

that you can recommend to our listeners that's

impacted your life in the past year or so?

Sinan Ozdemir: A book that I have not written, I assume?

Kirill Eremenko: Yeah. We will make sure to link to all of Sinan's books

in the show notes. He's definitely worth reading, but,

yeah, a book that you have not written.

Sinan Ozdemir: I've been thinking about this, and one thing that I've

been reading a lot of, and it's not a book per se, but

recently, I've started signing up for as many data

science and AI newsletters as I can.

Sinan Ozdemir: Now, that sounds almost like I'm inviting people to

spam my inbox, but what I really get from that is a lot

of different perspectives around AI and data science


that I definitely would not have gotten in my day-to-

day life. Even just today in my Medium newsletter that

I get, the first article was about how to appropriately

do hypothesis testing for machine learning

performance evaluation.

Sinan Ozdemir: I go, "Huh. I've done that before, but I'm curious to see

this person's take on how to do it."

Sinan Ozdemir: Then on a separate newsletter today, I'm reading about

different facial recognition bans in different states, in

different countries, and their reasonings for doing so.

So, I'm getting not only this statistical idea, but I'm

also getting this policy and governmental idea about

AI.

Sinan Ozdemir: So, getting these different perspectives from these

different newsletters I think is sometimes even more

valuable than just reading a single book that has,

potentially, one or a few perspectives.

Kirill Eremenko: Very interesting. So, what's your favorite newsletter so

far?

Sinan Ozdemir: My favorite newsletter, I mean, so far, has actually just

been the daily Medium curation, the Medium articles. I

think I get such a wide variety of people saying, "Here's

BERT from start to finish," and then right under that

is, "Here's what it looks like to spin up a dockerized

container in Kubernetes for machine learning."

Sinan Ozdemir: So, you get this really wide variety of different people

talking about different aspects in data science. I've

really been enjoying the daily Medium newsletter.

Actually, I try to read at least one or two a day.


Kirill Eremenko: Wow! That's crazy. How do you find the time for that?

That is insane.

Sinan Ozdemir: I'm a morning person. So, that helps.

Kirill Eremenko: Okay. Well, it's interesting that you mentioned that

because I actually ... Oh, when was it? Two or three

days ago, I was sitting reading newsletters myself, and

I don't do them every day. You're a superhuman if you

can do it every day, but I'm subscribed to two. One is

the Abundance Insider. I think you'll like it. It's not

just about data science and AI. It's about more

technology in the world by Peter Diamandis. It's free,

and it comes once a week on Fridays, and they curate

the top five developments in exponential technologies

in the world for that week.

Kirill Eremenko: The other one, it might sound like a plug, but I was

actually reading our own newsletter, which we have at

SuperDataScience. It's at superdatascience.com/dsi

for Data Science Insider. Our team curates exactly

what you mentioned like the top five AI, machine

learning, deep learning, whatever else developments in

the world for the past week.

Kirill Eremenko: The value I see in these things, why I can relate to

what you're saying is there's so much stuff going on,

and so much hype, and some things that are just fake

news or incorrect. Some things are insignificant and

stuff. It's much cooler when somebody curates it for

you and like, "Okay. Sinan, here's the top article for

today," or "Here are the top five for this week that you

will probably be interested in." It acts as a filter from


all of this barrage of news that's coming at you

every day. That would be my take on that.

Sinan Ozdemir: Yeah. No. I think you're absolutely right. I think it's

about getting as many perspectives as you can without

overloading yourself.

Kirill Eremenko: Very true. Very true. You should check out ours, the

Data Science Insider. I think you might like it. I'll send

you the link later.

Sinan Ozdemir: Sounds good. Perfect.

Kirill Eremenko: Okay. All right. Well, Sinan, once again, a huge

pleasure. Thank you so much. Very, very valuable

insights, and I can't wait. I'm going to personally

attend your workshop at DataScienceGO. Sounds

very exciting.

Sinan Ozdemir: Looking forward to it. Thanks so much for having me

on. I can't wait for the fourth time.

Kirill Eremenko: So, there you have it, ladies and gentlemen. That was

Sinan Ozdemir. I hope you enjoyed our conversation

as much as I did. This has been one of the best

podcasts I've had. Sinan just got such a great energy

about him. I love talking to him every single time. Plus,

of course, the content is amazing. How cool is that?

Learned about startups, acquisitions, explainable AI,

case studies, use cases, how they're hiring, about

BERT, NLP, state-of-the-art things. I learned so much

about BERT. It's crazy.

Kirill Eremenko: So, yeah, lots of favorite things. If you enjoyed the talk,

make sure to hit Sinan up, connect with him. I'm

going to share his LinkedIn and other places where


you can find him in the show notes at

superdatascience.com/333. That's superdatascience,

one word dot com slash 333.

Kirill Eremenko: If you know anybody who's interested in natural

language processing or who is looking for a job at a

cool company such as Directly, then send them this

podcast. Very easy to share. Send them

superdatascience.com/333. Very easy to remember as

well, triple three.

Kirill Eremenko: Finally, if you want to meet Sinan in person, make

sure to get your ticket to DataScienceGO 2020 US. It's

happening on the 6th, 7th, and 8th of November 2020.

Be there, meet Sinan and lots of other inspiring data

scientists. The last time in 2019, we had people fly in

from 25 countries from all over the world to connect

and network.

Kirill Eremenko: So, this is the place to be. Sinan will be there. He'll be

running at least one workshop, maybe two or maybe a

workshop and a talk or maybe a workshop and a

panel. We will see, but definitely, you'll get to meet him

there and chat to him all about NLP, and hand him

your resume if you want to.

Kirill Eremenko: On that note, thank you so much my friends for being

here, for being part of this amazing conversation with

Sinan. Huge shout out to Sinan. Thank you so much.

A huge shout out to Directly for acquiring Sinan's

startup and sparking this amazing conversation.

Thank you so much, everybody, and I'll see you next

time. Until then, happy analyzing.

