My posts tend to be 'off the cuff' - meaning I write them in one go about whatever I'm currently thinking about. There's not a lot of pre-planning (in most cases, save for tutorials), though I do go back and add bits, correct grammar errors, and put in links, pictures, etc. So apologies if you were expecting highly formalized PR or marketing spiel. ;) (Yes, I know. You weren't!)

Getting closer - my AR engine now runs on Android

Quick brain dump (so please forgive me if it doesn't seem structured enough...this is 'off the cuff')

I've had my AR plugin running in Unity on Windows for quite some time. It's not that difficult, when your background is as a Windows SDE, to write a DLL that links to OpenCV's DLLs. It's still a bit of trouble, since dynamically linking DLLs to other DLLs can lead to some memory (heap) violation crashes, but debugging is relatively easy.

Android is another story. I only accomplished that about a month ago.

I've been gradually making myself more comfortable writing for Android. When you let Unity do the hard work, it's easy to forget how much more complicated writing for Android becomes once you need to interact directly with the Android system. It's not that hard for Android devs, of course, but for those of us coming from a different background... well, it's a learning curve.

So naturally, I want to write a shared library (DLL in Windows speak) that works and that my Unity app links to correctly. The AR framework I wrote is a series of calls into the shared object (library), plus scripts that know what to do with the information the library feeds to the app. The library itself builds upon OpenCV, but it's more than just using OpenCV: I've written a few classes and a number of standalone functions, all designed to serve the needs of the Unity app and of the Unity shader that needs to look at the video stream -- post OpenCV processing.
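To make that concrete, here is a rough sketch of what a native plugin boundary like this can look like. The names, struct layout, and signatures below are my illustration for this post, not my library's actual exports; the Unity C# scripts would bind to functions like these with [DllImport] and plain data types.

```cpp
// Rough, illustrative sketch of a Unity native plugin boundary.
// All names here are placeholders, not the actual exports of my library.

extern "C" {

// Plain-old-data result that marshals cleanly across the C ABI.
struct MarkerPose {
    int   id;            // fiducial marker ID
    float position[3];   // translation in camera space
    float rotation[4];   // rotation as a quaternion (x, y, z, w)
};

// Called once from the Unity side with the camera frame size.
void ar_initialize(int width, int height);

// Unity hands over the latest camera frame (e.g. RGBA32 pixels) each tick;
// the library runs its OpenCV processing, fills 'poses' with up to
// 'maxPoses' detected markers, and returns how many it found.
int ar_process_frame(const unsigned char* rgba, int width, int height,
                     MarkerPose* poses, int maxPoses);

// Lets the Unity/shader side read back the processed (post-OpenCV) frame.
void ar_get_processed_frame(unsigned char* rgbaOut);

}  // extern "C"
```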

The processing could be as simple as warping the image frame (VFX) or as complicated as analyzing the image, either for camera calibration using a marker set, or for recognizing fiducial markers and working out 3D positional and rotational information. This is just the basics of AR marker recognition. And for those who need code, not words, there are several good OpenCV books out now*, or you can go straight to the OpenCV website. Look for camera calibration, ArUco, or AR markers.
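For the curious, the core of the marker path looks roughly like this in OpenCV's ArUco contrib module. This is a sketch, not my library's actual code; the dictionary choice, marker size, and names are placeholders, and the intrinsics (cameraMatrix, distCoeffs) come from the calibration step discussed below.

```cpp
// Sketch of fiducial marker detection + pose estimation with the OpenCV
// (3.x contrib) ArUco module. Dictionary, marker size, and names are
// placeholders, not my actual code.
#include <opencv2/aruco.hpp>
#include <opencv2/core.hpp>
#include <vector>

void detectAndPose(const cv::Mat& frame,
                   const cv::Mat& cameraMatrix,   // intrinsics from calibration
                   const cv::Mat& distCoeffs)
{
    cv::Ptr<cv::aruco::Dictionary> dict =
        cv::aruco::getPredefinedDictionary(cv::aruco::DICT_4X4_50);

    std::vector<int> ids;                           // marker IDs found
    std::vector<std::vector<cv::Point2f>> corners;  // their 2D corners
    cv::aruco::detectMarkers(frame, dict, corners, ids);

    if (ids.empty()) return;

    // 3D position (tvec) and rotation (rvec, Rodrigues form) per marker,
    // assuming a physical marker side length of 5 cm.
    std::vector<cv::Vec3d> rvecs, tvecs;
    cv::aruco::estimatePoseSingleMarkers(corners, 0.05f,
                                         cameraMatrix, distCoeffs,
                                         rvecs, tvecs);

    // From here the IDs and poses get handed back to the Unity side,
    // which decides which scene to bring up for each marker ID.
}
```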

*If people want, I can list off some of the OpenCV books I have. To be honest, though, the one I use more and more is the original, Learning OpenCV (1st edition online, 2nd edition in paperback). Why? Because they were written by the original researchers who brought us OpenCV, and they tend to explain, in terrifying detail, the intricacies of the algorithms. As I get further into OpenCV, I find that more and more useful (and understandable). The other books are basically show more than tell, with small code samples but less explanation than the original texts.

But after a week of intense remote debugging on my Android tablet (from my Windows desktop), I finally worked out what was going wrong with my library. It wasn't crashing, but it wasn't working either. I'll spare you the details for now. I want to keep this post short.

The main point is that it's working now, recognizing fiducial markers and bringing up scenes based on marker ID. So now I'm working on camera calibration again. Unless you've got a newer Android device that already has camera intrinsics worked out (think Tango or Daydream at a minimum), you've got to calculate the camera intrinsics yourself.

The biggest downside to this isn't that you might end up with 'not perfect' calculations. No, the biggest issue with camera calibration "on the fly" is that it's interactive. You need the user's help. This is why devices that support Tango or Daydream precalculate that information and save it on the device, and why a product like Vuforia precalculates profiles for a number of different cameras and then uses the closest match.

For the rest of us, for a general solution that works on any Android device, you must ask the user for help. The actual calibration isn't hard, but designing it so that a user (without deep computer knowledge) can follow along and do the right thing at the right time ... that's the challenge.
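For reference, the non-interactive core of the calibration is the well-trodden OpenCV path. Below is a minimal sketch using the classic chessboard target (my app uses a marker set, but the idea is the same); all names and numbers here are illustrative. The interactive part is everything that happens before this call: getting the user to show the target from enough different angles so that imagePoints contains varied views.

```cpp
// Minimal sketch of the non-interactive core of camera calibration with
// OpenCV, using a classic 9x6 chessboard target (a marker/ChArUco board is
// analogous). Board dimensions, square size, and thresholds are placeholders.
#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <vector>

bool calibrateFromViews(const std::vector<std::vector<cv::Point2f>>& imagePoints,
                        cv::Size imageSize,
                        cv::Mat& cameraMatrix, cv::Mat& distCoeffs)
{
    if (imagePoints.size() < 10) return false;   // need enough distinct views

    // The same known 3D layout of the 9x6 board (25 mm squares) for each view.
    std::vector<cv::Point3f> boardModel;
    for (int y = 0; y < 6; ++y)
        for (int x = 0; x < 9; ++x)
            boardModel.emplace_back(x * 0.025f, y * 0.025f, 0.0f);
    std::vector<std::vector<cv::Point3f>> objectPoints(imagePoints.size(), boardModel);

    std::vector<cv::Mat> rvecs, tvecs;
    double rms = cv::calibrateCamera(objectPoints, imagePoints, imageSize,
                                     cameraMatrix, distCoeffs, rvecs, tvecs);

    // A reprojection error around a pixel or less usually means the
    // intrinsics are good enough to save and reuse on this device.
    return rms < 1.0;
}
```

Each entry in imagePoints would be the 2D corners found in one frame (e.g. via cv::findChessboardCorners), and the resulting camera matrix and distortion coefficients get saved so the user only has to go through this once per device.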

And that's where I am now. I've worked out the basic procedure and walk the user through it, guiding them and telling them to point or wiggle their device as appropriate -- after they've confirmed that they are seeing markers being recognized.

And that's where I still am. Not the step-by-step part; I've gotten that working. But now, to make the user experience better, I'm trying to find ways not to do it! If I can avoid having to ask the user for help, or at least shorten the list of things they need to do, then that's a win.

It also makes me rethink the design of the application. AR is a broad term. It doesn't have to mean marker recognition or SLAM or any number of things.  Do I really need a particular AR feature for the app I'm designing? Is there another way?  Those are the kinds of questions I'm asking myself. And not surprisingly, the answer is "well, maybe not." And that's a good thing.

So the goal is to redesign the app so that the user gets to choose whether or not they want to use a particular AR feature. If they do, then the app informs them what is required. For example, if they want the recognition feature so the creature will dance on the marker in the book they are reading, then they will need to run camera calibration (if their device does not supply valid intrinsics). That empowers the user and hopefully makes them more willing to do the camera calibration sequence (and bear with the app if there are issues).

Surprisingly, you can get away with very little AR in an AR app and still make it a fun experience. Pokemon Go proved that a basic HUD screen (no recognition, no SLAM, nothing) adds to the user experience of using what is essentially a geolocation app.

