Minter Dialogue with Mihkel Jäätma

Mihkel Jäätma is cofounder and CEO of Real Eyes, Attention Measurement, a software that measures human response at the speed and scale of Artificial Intelligence. In their words, they are “the only fully Responsible AI for computer vision.” Real Eyes, which was founded in 2007, has been leveraging the evolution in AI to create a state of the art technology, able to detect emotions, attention and identity, in order to help brands understand the engagement of their advertisements. Real Eyes can also be used to verify identities to detect fraud, such as checking the authenticity of individuals filling in online surveys. We discuss the founding of Real Eyes, its journey across the evolution of AI and how the tech is being used by companies. Given Mihkel’s operational experience and understanding of AI, we look at the challenges and importance of ethics and being a responsible operator. We also peek into the future of AI.

Please send me your questions — as an audio file if you’d like — to nminterdial@gmail.com. Otherwise, below, you’ll find the show notes and, of course, you are invited to comment. If you liked the podcast, please take a moment to rate it here.

To connect with Mihkel Jäätma:

Check out Real Eyes’ site here
Find/follow Mihkel Jäätma on LinkedIn
Send an email directly to Mihkel

Other mentions:

The Power of Understatement in a Crowded Noisy World

Further resources for the Minter Dialogue podcast:

Meanwhile, you can find my other interviews on the Minter Dialogue Show in this podcast tab, on Megaphone or via Apple Podcasts. If you like the show, please go over to rate this podcast via RateThisPodcast! And for the francophones reading this, if you want to get more podcasts, you can also find my radio show en français over at: MinterDial.fr, on Megaphone or in iTunes.
Music credit: The jingle at the beginning of the show is courtesy of my friend, Pierre Journel, author of the Guitar Channel. And, the new sign-off music is “A Convinced Man,” a song I co-wrote and recorded with Stephanie Singer back in the late 1980s (please excuse the quality of the sound!).

Full transcript via Otter.ai

SUMMARY KEYWORDS: ai, people, data, emotions, technology, fraud, learn, identity, business, advertising, feel, surveys, company, level, scale, talk, tests, understanding, convinced, good, Minter Dialogue Podcast, Mihkel Jaatma, Realeyes, Real Eyes, Artificial Intelligence, Computer Vision, Emotion Detection, Brand Engagement, Ai Ethics, Responsible Ai, Identity Verification, Fraud Detection, Online Surveys, Tech Entrepreneurship, Ai Evolution, Ai For Advertising, Facial Coding, Ai Technology, Digital Experiences, Human Response Measurement, Future Of Ai

SPEAKERS: Minter Dial, Mihkel Jaatma

Minter Dial 00:05

Hello, welcome to Minter Dialogue, episode number 559. My name is Minter Dial and I’m your host for this podcast, a most proud member of the Evergreen Podcast Network. For more information or to check out other shows on this network, go visit evergreenpodcasts.com. So, my guest this week is Mihkel Jaatma. Mihkel is the co-founder and CEO of Real Eyes, a software that measures human response at the speed and scale of artificial intelligence. In their words, they are the only fully responsible AI for computer vision. Real Eyes, which was founded in 2007, has been leveraging the evolution in AI to create a state-of-the-art technology, able to detect emotions, attention and identity in order to help brands understand the engagement of their advertisements. Real Eyes can also be used to verify identities to detect fraud, such as checking the authenticity of individuals filling in online surveys. We discuss the founding of Real Eyes, its journey across the evolution of AI, and how the tech is being used by companies. Given Mihkel’s operational experience and his understanding of AI, we look at the challenges and importance of ethics and being a responsible operator. We also peek into the future of AI. You’ll find all the show notes on minterdial.com. Please consider dropping your rating and review. And don’t forget to subscribe, tick that button, to catch all the future episodes. Now for the show. Mihkel, it is great to have you on my show. I get very nervous when I see two A’s with lots of dots on top of them, to know how to pronounce it. Yatma or JackMa? You know, it’s all very different with Scandi or Estonian, Finnish languages. We got connected thanks to our mutual friend, your board member, Klaas, who is a classmate of mine from INSEAD. And so listen, let’s start off with who is Mihkel?

Mihkel Jaatma 02:22

Hey, thanks Minter for having me here. My own favorite version of my last name is Jumada. I mean, I still keep collecting them, and there is a good collection of them. I do come from Estonia. That’s absolutely correct. And I would present myself as a tech entrepreneur. So, I’ve been building up this company called Real Eyes, and we are the global leaders in computer vision. I would say we’ve been pioneering a lot of world firsts on what can be done over webcams to understand people better and deeper, and then make that useful for various use cases. So, I’m really looking forward to digging into all of that with you.

Minter Dial 03:03

Brilliant. Well, let’s start with a little bit of your personal story. So, you’re Estonian, and you are based in the UK, as I understand it. What brought you to the UK? Because, I mean, the Estonian digital market is vibrant.

Mihkel Jaatma 03:20

Yes. And actually, even before the digital market was vibrant, I was so happy to grow up in Estonia in the 90s, you know, because everything was vibrant. The whole country was like a startup; the whole thing was rebuilt every year. So, I guess that DNA stayed with me from the very beginning. I also come from an entrepreneurial family. That’s what my father did, you know, several businesses while I was growing up. So, I think that was kind of built into me since childhood. And why did I come to the UK? I think, you know, I was really honestly looking for a good idea, something that would have big global impact. You know, Estonia is a wonderful place. But you’ve got to travel and see other people and come across ideas. And that’s why I came. I came to study, so I thought going to school here would be a good idea to meet some interesting people and learn new ideas. And that’s actually how Real Eyes got founded.

Minter Dial 04:31

Also, you are cofounder and CEO?

Mihkel Jaatma 04:36

Yes. And my other co-founder is Elnar Hajiyev. We were both doing our different degrees back at Oxford when we first crossed paths, 17 years ago now.

Minter Dial 04:50

So, a small piece of trivia from me: I’ve been to Estonia maybe three or four times. And I was once invited to interview a woman called Kaja Kallas, who is the current prime minister. And, I mean, I was learning about the Estonian ecosystem. And I was invited to speak there a few times. And my interview with her really was remarkable for me. Because if you look at the majority of politicians, they’re all about the narratives, and sort of controlling what’s happening. When they look at someone who’s interviewing them, they want to make sure that they don’t get trapped or anything. Anyway, I was in the greenroom with Kaja, and I said to her, “Would you like to know what sort of questions I want to ask on stage?” She looked at me with this very cheeky grin and said, “Absolutely not. Surprise me on stage.” And I thought that that was so refreshing. And of course, the whole Estonian market, the digital ID and, I mean, everything, it’s a remarkable example. So, you come to England, and this idea comes up. How does this idea come to you?

Mihkel Jaatma 06:09

For Real Eyes? Well, I mean, if we want to be super specific about it, it was literally a conversation over a pint in the pub, where, you know, probably many things start. Because, I mean, back then, and if we go, you know, more than a decade back, AI was not as popular a word as it is today. And even our phones didn’t have webcams in them yet. So, particularly the stuff that we do today was definitely ahead of the times. But just to learn about the concept that, you know, you could actually use personal devices and cameras to understand people at a deeper level, the curiosity and the potential of it just grabbed us both at the first moment, as soon as we started to talk about that concept. Literally, we actually had to put together, it was an exercise, we had to do a business plan. You know, that was part of the curriculum: how do you do a business plan? And then a lot of people did burrito plans and stuff. But we thought, let’s actually try to do something interesting. And we were talking about what it should be. And then, yeah, that idea of using computer vision to understand people and their feelings and their attention and who they are on a deeper level, yeah, it was just so interesting.

Minter Dial 07:29

So, this was actually an idea that cropped up while you were at the Saïd Business School?

Mihkel Jaatma 07:35

Yes. So, I was doing a business degree, and Elnar was doing a computer science PhD. So, kind of very, very complementary, from that perspective.

Minter Dial 07:47

Lovely. So, for me, I mean, obviously I have no technical understanding of it. But I have to imagine that the challenge at some level is the number of things that you could do when you talk about a camera and breaking things down. I’ve talked to friends who try to break down eye movements, and what happens in eye movements, or what happens in the muscles around the mouth. Then there are all the nonverbal cues. How did you line up all the options, and then start working on them? What was the sort of roadmap that you started with? And where are you now?

Mihkel Jaatma 08:36

Yes, I mean, that is the really unique thing about the kind of visual understanding of people: you do get all of this. You get the eye movements, the attention level, you get the facial expressions that display different, you know, emotional states, and also just identity, which has actually now become one of the key things. Just, you know, is it a robot or is it a person? Is it the person who it says it is? I mean, these days it all comes from one single model, and you can run it locally on a phone. So, it’s come a long way since back then, when you couldn’t actually do it. We started in 2007. When we founded the company initially, you needed special equipment to do these kinds of things, infrared cameras and things, to build up the initial data sets, which was very cumbersome and limited. And it’s a historical fact that we were actually the very first company in the world to use webcams over the internet at scale, remotely. That was in 2011, when we started to do basic emotions, like attention, surprise, confusion, at the scale of thousands of people. So, yeah, we’ve been adding, you know, capabilities, making it work for all the audience groups in real-life settings. There are so many different dimensions to that story, as you say. Which is also why every year of that journey has been, you know, a rewarding learning experience along the way.

Minter Dial 10:21

Well, yeah, exactly. So, we’re starting with the technical aspects, of course. A webcam over the internet in 2011: it’s sort of hard to know or remember exactly how much slower things were and how much lower the quality was. High-definition cameras, which are much more accessible now, were far more expensive in the beginning. And if you don’t have high resolution, if you don’t have high bandwidth, it’s very hard to detect the nuance, because, I can’t remember the number of muscles that are in the face, but there are many. And there are certainly more than the five basic emotions. And how many emotions do you want to go down? I mean, so just going back to this notion of charting the path through all of the options, you obviously are going to have the business options and the opportunities that kind of maybe push you in a certain direction. There could be personal interests and desires. There might be things that are easier or harder to do, and more proprietary. So, what markers were you using to help you chart the path?

Mihkel Jaatma 11:31

Yeah, very good observations, because all of those things are considerations, right? And you learn every step along the way. So, the main criterion has always been the problem you’re solving, for whom, right? And what is it worth? So, the very first thing, which kind of naturally emerged, was advertising, right? You know, it’s measuring: are people even paying attention to those ads? And do they have any reaction to them? Because if not, you’re wasting billions of dollars. So, that was a very obvious first use case. And we quickly validated market fit there. Because essentially, we were, you know, going beyond the interviews and surveys, which were not always very indicative of what’s good or not. There’s actually much more signal in measuring this unfiltered behavior. So, the ad testing product became our first product that kind of paid our bills, and it’s still a very, very important part of our business. We still do that, because there are even more ads now. You know, that whole space of advertising itself is actually like a perfect learning ground for any new technology, because it moves very fast. And there’s a lot of data and actually a lot of willingness to test and innovate. So, it’s been great to be in advertising use cases as the first step. But yes, we’ve added the next use cases now, beyond that and coming from that, which I’m happy to talk more about as well, if you’re interested.

Minter Dial 13:10

Well, surely. I mean, hopefully, we’ll get into a few more of them. But in terms of observing faces, you don’t have a psychology degree, you’re not a Doctor of Medicine. What have you learned about humanity through your development of this technology?

Mihkel Jaatma 13:36

So, we have engaged the world’s best to advise us since day one. You know, we worked with the psychology professors who built the first automatic facial coding, in the very beginning, just to learn about the space, because you need to know what you’re doing. So, it was a critical component. And then, on top of it, we had to bring in the best engineers in the world to deal with all those latency and resolution issues. So, those were kind of the first years, where we actually had to get the machine up to doing this level of millions of measurements a year that we do now, which is probably, I mean, I don’t know for sure, but it might be the biggest emotion data set there is to actually, you know, measure human response to digital experiences. Not just that; we do other things now as well. So, what have we learned from this? I mean, it’s contextual, right? You know, are you dealing with advertising effectiveness? Are you improving mental health? You know, are you trying to make your experiences convert better and keep customers loyal? There are different emotional signals, actually, and attention signals that help you do all of those things better. So, you know, just one way to think about it is that everyone has user feedback, right? Most often it’s just clicks and likes or, you know, interviews or some sort of feedback like that. We just add a much deeper understanding to it. Because the internet so far has been, you know, without the body language. I mean, if people talk to each other, they see beyond the words. The internet today is still pretty much words, right? So, there’s a lot to learn, to see deeper and understand your users, if you have that visual understanding as well. So, that’s the foundational basis. Now, if you go back to our, you know, first business in advertising effectiveness, then, you know, making people smile is a very powerful thing.
It’s not easy to achieve with just showing ads. But if you do, and that positive engagement rolls over to when you show your brand, it drives your sales. You know, we have proved it for years with the biggest brands in the world, who are now using us as sort of the measurement standard of what’s a good ad or not. So, often the learnings are quite sort of common sense, and they make sense. But, you know, if you go to, like, running a big business, this kind of signal hasn’t been available at scale, right? It’s been some sort of one-off lab stuff. But more and more, it’s available just like any other digital signal, and, you know, that has the potential to make everything work better.

Minter Dial 16:30

So, what I was trying to get at, Mihkel, was something like this: right, you’re studying all these emotions, you’re able to encode that this looks like a smile, this looks like a tear, or whatever other emotions you’re driving at. But then, when you go to a dinner party, or you’re speaking to somebody in real life, like you and I are right now, to what extent has that informed your observation of the emotions of the person in front of you?

Mihkel Jaatma 17:01

I do think human-to-computer interaction is still very different from person-to-person like this. And so far, yeah, most of the application has been on how you make the digital experience, essentially, not frustrate you, and be a positive thing, and not become so boring that you just walk away. So, I don’t think that vision technology directly translates to the dinner table, as much as, you know, working on advertising effectiveness, you still learn about the story arcs and things which matter everywhere. So, that parallel is easy to see. Now you just have a better technology to quantify and visualize it second by second when you work with your marketing content, for example.

Minter Dial 17:51

Well, maybe the way into this question, for me, comes from the fact that having studied the idea of encoding empathy into a machine, it’s made me and people who are working on it far more aware of what empathy is. And with emotions, for example, with some of the psychologists that I’ve talked to, it seems that outside of the specialists, most of us are actually quite confused as to which emotions we’re feeling and which emotions we’re observing: the difference between frustrated and irritated, sad and depressed. Well, you know, the splitting of those hairs, for some, can be quite a challenge and lead to miscommunications. So, I’m wondering to what extent that might have been something you personally have been on a journey with.

Mihkel Jaatma 18:53

Actually, that journey has been one of simplification rather than going deeper into the nuance. Because we’ve tried, and, you know, going into these nuances of emotions is a very complex story, and it gets very contextual very quickly. Whereas, you know, if you really want to actually solve problems, then being able to just detect positive and negative everywhere, all the time, is a much more valuable and powerful thing than getting into that level of nuance. So, it was in the earlier years that we had like 50 different metrics and things. Now we talk about identity, attention, positive and negative emotions, and there are maybe three shades of negative that are worth looking into, confusion and disgust, in different ways. So, in the world of making digital experiences better for people, that gives you already pretty much all you need.

Minter Dial 19:57

Well, that makes tons of sense, and it’s the same thing at some level with regard to empathy: context is everything. When you’re trying to understand somebody, you know, if they’re angry, well, you need to know why they’re angry and such. But that’s fascinating. So, in terms of technology, of course, I’ve had a chance to work with Haivision, a company out of Canada that works with this latency issue. And so, I got this understanding of the technology of H.264, the technology that’s generally the most basic for encoding and decoding the video that’s being transmitted. But you’re dealing with, presumably, computers and webcams of all varieties. And so, you can’t send out the good camera for the right thing everywhere. But do you find, I mean, today, that your life has been simplified as well by the number of devices that are out there, and the number of laptops, desktops, and phones that come with good cameras?

Mihkel Jaatma 21:12

Yeah, that has actually absolutely been one of the core pillars since the beginning: what we do is completely hardware agnostic. And initially, it was just a scale issue. But in what we do, computer vision AI, the thing that I’m most passionate about now, and I think will be more important going forward, is this whole concept of responsible AI. And, you know, the OECD and the European Commission, they all have different ways of describing what it means and what trustworthy AI is. In our world, there are three things, and one of those is what you described; I’ll come back to that. But one is fairness, really: making all of those algorithms work as well as they can across all audience groups. It doesn’t matter what your cultural background, age and gender are; just making sure that it works for everyone. Vision AI has deservedly gotten a lot of bad reputation for being biased, and it was, historically. So, being fair is one of the pillars. The second is the robustness thing that you mentioned. So, it needs to work in real time, in real life, you know, on poor devices, in every country, in poor lighting conditions, with motion blur. That is the area where I think we as a company have probably done the most of any of those, because that’s the environment we’ve worked in since day one. That’s all you have: whatever you have on your phone, or a five-year-old laptop, that’s the input that we need to work against. So, managing fairness is difficult, managing robustness is difficult, but doing them both at the same time is exponentially more challenging. So, that’s why we keep investing in it. And the third part, just to finish up the whole responsible AI part, is actually just the legality of the data, right? Because if you go into the vision AI space, then most of this data has been borrowed on very grey grounds.
It doesn’t involve GDPR-level explicit consent from the user, that I’ll take this from [unclear] or use it for that model. So, it’s not really legal. Whereas what we’ve done now is we’ve actually paid money to 12 million people to get their video recordings with full consent to train AI. So, it’s also legally clean. And you know, putting those three components together is what hopefully sets a high standard for the whole industry going forward.

Minter Dial 23:46

Is that a standard that your clients are willing to pay for? Because, let’s say, I’m sure that there are some other actors who might not have such a sense of responsibility, who might just scrape a whole lot more images and not bother paying or worrying about legalities. And to the extent that you’re actually paying these people, that must flow through into the cost for the customers. Is that a discriminating factor?

Mihkel Jaatma 24:16

I think it will be the main buying criterion for any AI service in two years. Our customer base is literally the top enterprises in the world, the biggest tech platforms, the biggest brands. So, they have started to deal with that issue on a one-on-one sort of procurement relationship basis. And it’s a very difficult thing to assess. There are, you know, a lot of questions, and running those tests is complex. So, it’s worth it for those big companies to do it, because, you know, they care about those things, but it doesn’t translate to the rest of the market. And that’s something that we actually want to do as a company much more: make this kind of evaluation free and frictionless, forever. We want people to put in their computer vision algorithms and see how they fare against it. So, you know, we’re still a scale-up, or however you want to call it. So, we don’t have, you know, that much time to work on it on our own. But I think in a year or two, we’ll have something like that out as a sort of beneficial tool for everyone to make progress faster.

Minter Dial 25:23

So, some sort of open platform?

Mihkel Jaatma 25:27

Yes, to evaluate your computer vision algorithms. You know, are they legal? Are they robust? Are they fair? You know, just put your model in and see immediately.

Minter Dial 25:37

At that level? Yeah, so, yeah. And with regard to the 12,000, was that a financial or mathematical or scientific number, in terms of getting to have sufficient breadth of diversity?

Mihkel Jaatma 25:55

Well, it’s 12 million altogether as the universe of the videos.

Minter Dial 25:58

12 million, sorry!

Mihkel Jaatma 25:59

Yes. And, I mean, within that we have created sort of balanced data sets, which are carefully selected sub-segments of it, of different backgrounds. So, we can measure whatever algorithm we have across that. You know, the 12 million is the entire base that we can learn from, and then we have a small subset of it that is our testing data set to measure ourselves against. So, that’s taken out to run these kinds of measurements.
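Editor’s sketch: the idea of measuring an algorithm against a held-out, balanced test subset can be illustrated with a per-group accuracy check. This is only a minimal illustration under stated assumptions, not Real Eyes’ actual method: the record format `(group, predicted, actual)`, the function names and the “gap” metric are all hypothetical choices made for this example.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return accuracy per audience group.

    `records` is an iterable of (group, predicted, actual) tuples.
    The tuple layout is a hypothetical format chosen for this sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        if predicted == actual:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def fairness_gap(per_group_accuracy):
    """Difference between the best and worst served group (0.0 means equal)."""
    values = list(per_group_accuracy.values())
    return max(values) - min(values)


# Tiny made-up test set: two groups, two samples each.
held_out = [
    ("group_a", "smile", "smile"),
    ("group_a", "neutral", "neutral"),
    ("group_b", "smile", "smile"),
    ("group_b", "smile", "neutral"),
]
per_group = accuracy_by_group(held_out)
gap = fairness_gap(per_group)
```

A small gap on a balanced test set is one simple signal that a model serves different groups similarly; a real evaluation would likely use more than one metric.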

Minter Dial 26:31

Yeah, I was, I was thinking in my mind about a kind of an app. I don’t remember where it was started, but somewhere in Asia called beauty.ai. Did you ever come across that one?

Mihkel Jaatma 26:45

Not yet, no, but…

Minter Dial 26:49

Its design was: send in your photograph, and an AI will determine where you rank in terms of beauty. So, there was this cheeky desire to want to, obviously, be considered beautiful, but the highest scoring people were all white. A couple were Chinese or Asian looking. But anyway, yeah, they got duly screwed for their lack of diversity and fairness. So, in terms of business models, we’ve talked a little bit about the advertising piece, which of course is sensitive to me, because I worked for L’Oréal, one of the top five advertisers in the world. So, I’m very, very familiar with concepts of making good ads and all that. What are the other business models? I have a feeling fraud has to be one of the more.

Mihkel Jaatma 27:46

Yeah, it actually is, yes, because I think this whole AI explosion is still playing out, and it will play out in many unexpected ways. But one of the effects of it for us is that, essentially, after the ChatGPT stuff came out, the level of fraud has just exploded, because the tooling to do that is so much better and keeps on getting better. So, it doesn’t really matter what you do online, whether you run surveys or sell tickets, or rely on user reviews, or marketplaces: everywhere, fraud is up, because it’s so much easier to do it with the latest AI tools if you’re a bad actor. And, you know, it takes many tools to deal with it, right? And face verification has been one of the things that’s been used for decades, right? When you cross borders, or when you want to get into the bank, it’s been a well-established way to do it. But, you know, if you’re a user, it’s a cumbersome experience. It takes time. And you know, people wouldn’t want to do it as part of their normal digital experiences. So, we made this face verification technology a two-second experience that you can plug and play into any digital experience to have that same level of face verification as a way to manage identity. And that’s picking up. The first use case for us has been literally in running surveys. It’s a very large industry, right? 10 billion is spent on running surveys every year. And what we see in our data right now is that 40% of those surveys are not what you buy. You know, they’re either bots, or one person doing it 10 times, or people lying about their age and gender, massively. So, there’s a lot to clean up there. And that’s one of the use cases which is picking up well for us now, just managing identity, because, you know, it’s harder to tell who is a bot and who is a person online these days.

Minter Dial 30:05

Well, this brings up the famous book by Philip K. Dick, Do Androids Dream of Electric Sheep? And in this story the androids become so good that it’s very hard to detect them, and the piece that they use is this Voight-Kampff test, which essentially tries to understand if the being in front of me is an android or real through whether or not it has empathy. And so that’s a funny story. But the notion of fraud is maybe obviously on the legal side. In the notion of what you do with Kantar, the survey providers, I mean, I think they do other things like consulting too, what you’re trying to do is not necessarily to detect fraud, but to render the surveys more precise. And so, understanding whether or not it is a bot is one thing, whether or not it is a man, whether it is an older man or a younger man. Is that what you’re trying to do? Is that what you’re basically doing with Kantar?

Mihkel Jaatma 31:16

Yes, so I mean, the question is, like, where does fraud start? And where is it fake and wrong data, right? Because it costs tens of millions for these companies, actually just paying out survey incentives to these click farms, which are being operated from different parts of the world. So, it is hardcore fraud from that perspective. But you’re absolutely right that Kantar is the most trusted consumer data company, right? All the brands pretty much rely on it for their decisions. So, those surveys influence, I don’t know, a trillion dollars plus in terms of branding and marketing decisions. And if you do that on noisy data, then you make wrong decisions. And that costs a lot more than a couple of tens of millions in survey fraud issues. So, the bigger side is that making the right decisions on actual real data is worth a lot more.

Minter Dial 32:19

And what about identity, for example, for the government or for political votes? Is this an area that you’re exploring, or that you’re involved with? Or is that maybe a future idea?

Mihkel Jaatma 32:34

No, I mean, identity is such a broad space. So, roughly, I would say that, you know, there are already many good solutions available for what we call sort of higher-stakes things: government and voting and banking access. You know, there are technologies and solutions out there. What’s missing is lower-stakes things, the normal consumer experiences: online reviews, surveys, you know, buying tickets. So, that’s what we call lightweight identity, where essentially we’re just trying to be a better CAPTCHA. Because with CAPTCHA, AI can already get through it better than people. And it’s pretty annoying for users. So, we can make it simpler and safer at the same time.

Minter Dial 33:20

Alright, so let’s just go through that. Because CAPTCHA, a technology, for people who aren’t exactly familiar with it, would basically be, for example, put this jigsaw puzzle in the right place, or click this button that says “I am a human” or “I’m not a robot,” or other methods of establishing my humanity versus my robot-ness. I don’t know to what extent CAPTCHA does all of those, but what is the difference with yours? And how would it work on my phone or my laptop, for example, compared to that type of CAPTCHA?

Mihkel Jaatma 34:08

So, as a user, it will be several times easier for you. You don’t need to chase the traffic lights in the pictures. All you need to do is read our GDPR-compliant consent and then click here to verify. It’s a two-second experience, and that’s all you need to do. So, it is much quicker and much simpler for you. And what we do in the background is that our algorithms create a mathematical mesh of your face, called a facial embedding. We’re not storing any pictures, actually; it’s privacy-safe from that point of view, in that it’s just mathematics of your face that we can then store and run checks against: are you a real person, and are you the same person that you’re supposed to be?
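To make the idea of “just mathematics of your face” concrete, here is a minimal sketch of how a stored embedding might be compared with a fresh one. It is an illustration only, not Real Eyes’ actual method: the function names, the use of cosine similarity and the 0.8 threshold are all assumptions, and a real system would get its embeddings from a trained face model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def is_same_person(enrolled, candidate, threshold=0.8):
    """Hypothetical check: do two embeddings plausibly belong to one face?

    The threshold is an assumed, illustrative value; in practice it would be
    tuned against false-accept and false-reject rates.
    """
    return cosine_similarity(enrolled, candidate) >= threshold
```

Only the vector of numbers needs to be stored for later checks, which is the point being made about not keeping pictures.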

Minter Dial 34:53

Well, so are there bad actors that are sort of creating fake faces?

Mihkel Jaatma 34:59

Yes, yeah, that is a big part of any fraud, and it’s always an arms race. So, deep fakes and using photos: the sophisticated fraudsters have those tools available. So, if you want to go into your bank account, you need to have more tests than that. There are things called liveness tests and so on, which we also have and can turn on on an as-needed basis, if we need to verify or if we have reason to suspect that this is not, you know, a real person there. So, there are different layers of things that you can switch on. So, we can go beyond the first picture to do sort of an ongoing liveness test. And then what is really unique about us is that we can continue tracking that continuously, because most of this identification right now is still a one-off spot check, right? When you create an account, you do it. But then if you’re using that for, let’s say, child protection or child safety regulation, which is one use case: I mean, you might have a phone and you verify it’s you. But then your child comes and uses your phone and looks at all sorts of stuff that they shouldn’t look at. You know, this continuous tracking is currently not deployed at scale by anyone yet, but we have the technology, and we think it will be in the coming years.
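The difference between a one-off spot check and continuous tracking can be sketched as follows. This is a hypothetical illustration, not a description of Real Eyes’ system: the function name `monitor_session`, the similarity threshold and the `max_misses` tolerance are all assumed, and the per-frame embeddings would in reality come from a face model running on the camera feed.

```python
import math


def _cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def monitor_session(enrolled, frame_embeddings, threshold=0.8, max_misses=2):
    """Re-check the enrolled face on every frame instead of only at sign-up.

    Returns the index of the frame at which `max_misses` consecutive frames
    have failed to match (the point where a stricter layer, such as a
    liveness test, might be switched on), or None if the session stays clean.
    """
    misses = 0
    for index, embedding in enumerate(frame_embeddings):
        if _cosine(enrolled, embedding) >= threshold:
            misses = 0  # the authorized face is back; reset the counter
        else:
            misses += 1
            if misses >= max_misses:
                return index
    return None
```

Requiring several consecutive mismatches is one assumed way to avoid reacting to a single bad frame; a spot check, by contrast, would only ever look at the first frame.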

Minter Dial 36:26

Well, that sounds like something that’s a little bit higher stake.

Mihkel Jaatma 36:32

Well, higher stake. Yeah, it’s how you define it. I mean, we say the higher stakes are really border crossing and KYC-level financial services stuff. Whereas if you talk about digital ecosystems and, you know, content moderation and things like that, there you have to do something like 1,000 tests per day, right? So, you can’t go through banking-level verification for all of them. While

Minter Dial 37:01

I was thinking more about protecting a child from, let’s say, pornography or something.

Mihkel Jaatma 37:06

Yeah. So, that’s what continuous tracking would help you do: you know, make sure that if there’s another person whose face is not authorized, they cannot do that.

Minter Dial 37:19

For the last part of what we have together here, I want to talk about AI itself. So, obviously, it’s a huge part of what you do. It’s kind of all over the news, and everybody knows that it exists. For you, the arrival of these large language models, was that significant? And to what extent have you surfed on that?

Mihkel Jaatma 37:42

No, I think for everyone it is. There’s no business that can say, “We haven’t changed.” Maybe hairdressers, for now. But still, yes, even on this advertising side that we do, the obvious first thing is that you can automate a lot of stuff where you had to build software, or where people had to do some sort of decision making or recommendation. So, the quickest thing that we implemented was like: okay, we have all this, here’s the ad, here’s the response, but what should I do differently, right? That used to take a consultant to look at the data and lines and make conclusions. Now it takes two seconds for the language model to do that and come back with some pretty amazing recommendations. And obviously, the next step is like: okay, I have the recommendation, but now I put it into the content generation engine, and the machine suddenly becomes end-to-end super smart. And that’s pretty exciting. I mean, the tests are there, but nobody’s really done it at scale. And all this AI still runs against what we would call limited or dumb data; it’s all just clicks, which is better than nothing. But putting that sort of human response data, at the sort of synthetic prediction scale, into those engines is super exciting. And it’s kind of the main thing, actually, that we’re doing right now.

Minter Dial 39:17

And many people will, I’m sure, have asked you this, since you must know a lot about AI: are you an optimist about how it’s going to be used? Or do you have a generally negative opinion of how AI is going to play out?

Mihkel Jaatma 39:38

Yeah, I’m taking the color system in an approach to that since beginning, which is sort of that you have to be proactive about those things. It doesn’t solve any issues to theorize about whether it is good or bad. This thing is out. It’s evolution. It’s going to keep on going. What you can do about it is make it as good as it can be, which is why that responsibility that I mentioned is, I think, so core to us. So, be a good guy in it, and try to be as influential as possible and get it out there.

Minter Dial 40:14

Yeah, I mean, obviously, I tend to agree with you. The fact is that every technology has been used and misused and abused, most often by marketers, I would say; they have a terrible tendency to go and find the easiest way to screw it up and lose the trust of their customers. But in terms of what’s happening with AI, it seemed to me that the arrival of this generative AI was because we were able to merge so many different aspects, including the arrival of the processing power, the cloud, and the massive amounts of data. I’m wondering, what are the next steps with AI? Do you have a vision of where it’s going? I have to imagine you’re at the forefront of where this is going. But whether it’s cleaning the data or having more proprietary data, what should people who are listening to this be thinking about in terms of their approach to AI?

Mihkel Jaatma 41:21

My experience is that the more frontline you are, the more cautious you are about giving any predictions that go beyond a month or two, because it really changes. It’s even the foundations of how this whole business model of LLMs is going to work out. You know, all the money and effort is going into those big models, but is that what’s going to win? What we’re seeing is that there’s a lot of value still to be added in, you know, just the downstream application of them. So, you use that big brain as part of it, but at the end of the day, it’s still the unique data that you somehow feed into it that makes it work better and makes you, you know, ask the right questions of it, in the form of some sort of design of a system to solve a particular problem. And right now, it looks like there are going to be plenty of those big foundational models to tap into, which are good enough. And then the much bigger difference would come from having some special data and knowledge of how to deploy that to solve a specific problem. So, I hope it goes that way. But you know, it caught us all by surprise, how good it is, right? So, if Sam Altman is right, and he said the next version is even orders of magnitude better, then everything changes, right? It doesn’t matter how well we think or plan here. So, definitely, you know, I’m as optimistically worried as possible.

Minter Dial 42:52

I think like, like some moments business stay open. Because there are a lot of changes around the corner. So, for Real Eyes, among your typical customers, you have the people who are involved with checking identity, and you have the people who are involved in branding, checking the success of advertising. Any other big customer segments?

Mihkel Jaatma 43:13

Well, I mean, we started from advertising; we built our training data there. Now we’ve seen this identity problem that has picked up at consumer scale. Where we see the next big thing for us is kind of coming back to that comment that I made about the internet still being text and not body language. And that’s going to change, right? You know, right now we talk and type into these ChatGPT apps. They’re going to look like other people, and they’re going to be multimodal. So, these apps that we use and will rely on more and more will need some human feedback to be as good as they can be. So, becoming the best component, with the fairest models and the most robust models, for those agents to understand our response, like who’s saying what and how they are feeling about it, I think that will be the next big thing, because the UX, the interface, will change from what we do today. And it will progress as fast as the whole core AI technology came in, over the next couple of years.

Minter Dial 44:22

So, if I want to get that straight: I’ve just been writing, actually, about the power of and the need for the word, and the desire for more correct spelling, as if you screw up one letter, sometimes that makes just a world of difference. But it does seem to me that even if we’re going to anthropomorphic bots, if you will, or, you know, maybe something that looks easier to relate to, there’s still a translation through the word. It seems, at least, that’s the way we operate as human beings. You know, you have an emotion, and you are then able to express it or concretize it by saying, “Oh, I am feeling happy.” That’s this five-letter word. And this is a manifestation, and I’m able to say what it is and to communicate it around. So, it feels like the word will remain. And you know, every visual, every image needs to be tagged in order for the machine to understand it. And that tagging is a word-based process. Do you see this? Or is it that quantum physics, quantum mechanics or computing can actually change that, and the word will be eliminated?

Mihkel Jaatma 45:38

No, it depends on what level, I mean. You don’t need to tag that many things; the foundation models already tag, and their multimodal understanding is already so good that you don’t need to do the tagging for many things. But I mean, multimodality will win, right? If you talk about avatars or agents in the world, I mean, what do you say to them? Whether you type it or say it, the text part is still, you know, foundational, right? But then, you know, it would be better if that agent also understood whether you’re frustrated or happy about what it’s telling you back, right? Because right now, there’s this thumbs-up, thumbs-down feedback, which almost nobody ever clicks, and it’s very blunt anyway. Whereas, imagine now that with every single frame, you get a much deeper understanding of whether my response is going in the right way or not. So, I think adding that feedback loop will make this whole text thing much more aligned with the agents.

Minter Dial 46:40

The context of the text, in fact, yeah. Beautiful. All right. Well, Real Eyes. Let’s just talk in the last minute about, let’s say, the size of your business and funding. Where are you in terms of funding, going public, raising money? I’m sure that’s a big old topic as well.

Mihkel Jaatma 47:04

Yeah, we’ve been operating sort of off commercial proceeds for a while. And that’s always good to have in the bucket. It is an interesting year, though, I think, in terms of what’s going on in the market. You know, our customers are these big enterprises that I mentioned, right? So, when you have client relationships, then sort of corporate development conversations are always a natural part of it. So, based on that, it seems like this year there will be more action than there has been, maybe on the market level as well, because people have been cautious for a couple of years. And generally, it looks like everything’s going up for now. So, we’ll see how that works out for us. But we’re building an independent tech platform that wants to be the visual human understanding component for the future of how the general Gen AI ecosystem will work. So, that’s what we set out to do.

Minter Dial 48:02

Exciting stuff. If you had a magic wand, who would you like to be listening to this podcast to call you?

Mihkel Jaatma 48:13

That’s a very good question. You know, the best answer to that is always a surprise. Coming back to your “surprise me” comment that I liked in the beginning, you know, it might be someone who wants to join the team and make it real. I mean, obviously, you know, customers like big corporations wanting to understand their users better; I mean, there’s a lot we could do there quicker and faster. But I’d like it to be a surprise that I’m not thinking of, if I want which

Minter Dial 48:43

So, if you’re listening, you can call; Mihkel is waiting for you. Great. Many, many thanks. So, how can people get in touch with you, or at least your business? What’s the best way? What links would you like people to get sent to?

Mihkel Jaatma 49:02

Mihkel@realeyes.com. You can also find me on LinkedIn; I use that very actively. So, please do use either of those.

Minter Dial 49:11

Beautiful, I’ll put those in the show notes. Mihkel, many, many thanks. It’s very exciting to hear what you’re up to. I look forward to understanding more, following you, and seeing where it all goes. So, many thanks, and be in touch.

Mihkel Jaatma 49:28

Thanks so much Minter for having me. Talk to you soon. Bye.

Minter Dial 49:33

So, a really heartfelt thanks for listening to this episode of The Minter Dialogue podcast. If you liked the show, please remember to subscribe on your favourite podcast service. As ever, rating and reviews are the real currency of podcasts. And if you’re really inspired, I’m accepting donations on www.patreon.com/Minterdial. You’ll find the show notes with over 2100 blog posts on minterdial.com on topics ranging from leadership to branding, tech and marketing tips. Check out my documentary film and books including my last one, the second edition of “Heartificial Empathy, Putting Heart into Business and Artificial Intelligence” that came out in April 2023. And to finish here’s a song I wrote with Stephanie Singer, “A Convinced Man.”

I like the feel of a stranger

Tucked around me

Precipitating the danger

To feel free

Trust is the reason

Still I won’t toe the line.

I sit here passively

Hope for your respect

Anticipating the thrill of your intellect

Maybe I tell myself

There’s no use in me lying.

I’m a convinced man,

Building an urge

A convinced man,

To live and die submerged.

A convinced man,

In the arms of a woman

I’m a convinced man

Challenge my fate

I’m a convinced man

Competition’s innate

A convinced man

In the arms of a woman.

Despise revenges

And struggle to see

Live for the challenge

So, life’s not incomplete

What’s wrong with challenge

I know soon we all die

I’m a convinced man

Practicing my lines

I’m a convinced man

Here in these confines

A convinced man

In the arms of a woman.

I’m a convinced man

Put me to the test

I’m a convinced man

I’m ready for an arrest

I’m a convinced man

In the arms of a woman.

I’m a convinced man… so convinced

You convince me, yeah baby,

I’m a convinced man

In the arms of a woman…

Minter Dial

Minter Dial is an international professional speaker, author & consultant on Leadership, Branding and Transformation. After a successful international career at L’Oréal, Minter Dial returned to his entrepreneurial roots and has spent the last twelve years helping senior management teams and Boards to adapt to the new exigencies of the digitally enhanced marketplace. He has worked with world-class organisations to help activate their brand strategies, and figure out how best to integrate new technologies, digital tools, devices and platforms. Above all, Minter works to catalyse a change in mindset and dial up transformation. Minter received his BA in Trilingual Literature from Yale University (1987) and gained his MBA at INSEAD, Fontainebleau (1993). He’s author of four award-winning books, including Heartificial Empathy, Putting Heart into Business and Artificial Intelligence (2nd edition) (2023); You Lead, How Being Yourself Makes You A Better Leader (Kogan Page 2021); co-author of Futureproof, How To Get Your Business Ready For The Next Disruption (Pearson 2017); and author of The Last Ring Home (Myndset Press 2016), a book and documentary film, both of which have won awards and critical acclaim.


The State of the Art of Computer Vision Tech with Mihkel Jäätma, cofounder and CEO of Real Eyes (MDE559)


Full transcript via Otter.ai
