
Using FirebaseMLKit with CameraX

Andrew Marshall

So you’ve watched the CameraX introduction at Google I/O 2019 and seen all the cool image manipulation and face detection implemented in the demo apps. Then you worked through the CameraX codelab, but the analysis demonstrated in that app just calculates luminosity. What if we want something a little flashier?

Fortunately, we can make use of another one of Google’s libraries, Firebase ML Kit. ML Kit makes face detection super simple, and CameraX’s analysis step makes it easy to feed images to the face detector. Let’s see how to combine the two to detect the contours of a person’s face!

The setup

Our MainActivity will handle asking for permission to use the camera, and then delegate to CameraFragment when permission is granted:

private const val CAMERA_PERMISSION_REQUEST_CODE = 101

class MainActivity : AppCompatActivity() {

  override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    setContentView(R.layout.activity_main)

    if (hasCameraPermissions()) {
      supportFragmentManager.beginTransaction()
          .add(R.id.content_area, CameraFragment())
          .commit()
    } else {
      requestPermissions(arrayOf(Manifest.permission.CAMERA), CAMERA_PERMISSION_REQUEST_CODE)
    }
  }

  private fun hasCameraPermissions(): Boolean {
    return ContextCompat.checkSelfPermission(this, Manifest.permission.CAMERA) == PackageManager.PERMISSION_GRANTED
  }

  override fun onRequestPermissionsResult(requestCode: Int, permissions: Array<out String>, grantResults: IntArray) {
    super.onRequestPermissionsResult(requestCode, permissions, grantResults)
    if (requestCode == CAMERA_PERMISSION_REQUEST_CODE) {
      if (grantResults.isNotEmpty() && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
        supportFragmentManager.beginTransaction()
            .add(R.id.content_area, CameraFragment())
            .commit()
      }
    }
  }
}

Of course, for the above to work, our app will need the permission declared in the manifest:

<uses-permission android:name="android.permission.CAMERA" />

CameraFragment will handle initializing the CameraX use-cases and binding them to its lifecycle. For now, the layout for the fragment, fragment_camera.xml, just consists of a FrameLayout containing a TextureView:

<FrameLayout
  xmlns:android="http://schemas.android.com/apk/res/android"
  android:layout_width="match_parent"
  android:layout_height="match_parent">

  <TextureView
    android:id="@+id/camera_view"
    android:layout_width="match_parent"
    android:layout_height="match_parent" />

</FrameLayout>

We’ll use the TextureView to display the SurfaceTexture representing the camera output from CameraX’s preview use-case. We get a reference to the TextureView in CameraFragment’s setup methods:

class CameraFragment : Fragment() {
  private lateinit var cameraView: TextureView
  override fun onCreateView(inflater: LayoutInflater, container: ViewGroup?, savedInstanceState: Bundle?): View? {
    val view = inflater.inflate(R.layout.fragment_camera, container, false)
    
    cameraView = view.findViewById(R.id.camera_view)
    
    return view
  }
}

Add CameraX

First, we’ll need to add the CameraX dependency. I’ve found that it works most consistently if you add the Camera2 dependency as well:

def camerax_version = "1.0.0-alpha02"
implementation "androidx.camera:camera-core:$camerax_version"
implementation "androidx.camera:camera-camera2:$camerax_version"

We’ll then add a method to set up a CameraX instance to be associated with CameraFragment:

override fun onCreateView(...) {
  ...
  cameraView.post {
    setUpCameraX()
  }

  return view
}
private fun setUpCameraX() {
  CameraX.unbindAll()
  val displayMetrics = DisplayMetrics().also { cameraView.display.getRealMetrics(it) }
  val screenSize = Size(displayMetrics.widthPixels, displayMetrics.heightPixels)
  val aspectRatio = Rational(displayMetrics.widthPixels, displayMetrics.heightPixels)
  val rotation = cameraView.display.rotation
}

We need to establish the size, aspect ratio, and rotation of our target view so that we can properly configure the CameraX use-cases. By calling setUpCameraX() from within cameraView.post(), we ensure that it doesn’t get run until the view is completely set up and ready to be measured.

Build the CameraX use-cases

Since eventually we want to draw the detected face contours on the preview image, we need to set up the preview and analysis use-cases together, so we can transform their output for proper display. We also need to be able to resize and rotate everything properly when the device is rotated.

To encapsulate this logic, we’ll make a utility class called AutoFitPreviewAnalysis. If you’ve checked out Google’s CameraX sample project, you may have seen their AutoFitPreviewBuilder. Our AutoFitPreviewAnalysis will be a modified version of that class, so we’ll start by copying that class into our project.

Go ahead and change the class name to AutoFitPreviewAnalysis. Since we’re creating both a Preview and Analysis use-case, let’s change build() to take the configuration parameters from CameraFragment and simply return an instance of the class:

fun build(
  screenSize: Size,
  aspectRatio: Rational,
  rotation: Int,
  viewFinder: TextureView
): AutoFitPreviewAnalysis {
  // `config` is the preview config left over from the copied AutoFitPreviewBuilder;
  // we'll replace it with our own configuration objects next
  return AutoFitPreviewAnalysis(config, WeakReference(viewFinder))
}

We now have everything we need to create the configuration objects for both the Preview and Analysis use-cases:

private fun createPreviewConfig(screenSize: Size, aspectRatio: Rational, rotation: Int): PreviewConfig {
  return PreviewConfig.Builder().apply {
    setLensFacing(CameraX.LensFacing.FRONT)
    setTargetResolution(screenSize)
    setTargetAspectRatio(aspectRatio)
    setTargetRotation(rotation)
  }.build()
}
private fun createAnalysisConfig(screenSize: Size, aspectRatio: Rational, rotation: Int): ImageAnalysisConfig {
  return ImageAnalysisConfig.Builder().apply {
    setLensFacing(CameraX.LensFacing.FRONT)
    setImageReaderMode(ImageAnalysis.ImageReaderMode.ACQUIRE_LATEST_IMAGE)
    setTargetRotation(rotation)
    setTargetResolution(screenSize)
    setTargetAspectRatio(aspectRatio)
  }.build()
}

Since we’re going to need to map the contour points from the analysis to the preview image, it’s important that both preview and analysis are set up with the same target resolution and aspect ratio.

We also set the analysis’s imageReaderMode to ACQUIRE_LATEST_IMAGE, which always returns the latest image, discarding any others. This will keep the analysis working on the most up-to-date frame, without clogging up the pipeline with old frames.

For simplicity, we’ll hard-code the camera to the front (selfie) camera.

Create the configuration objects in build(), then pass them to the AutoFitPreviewAnalysis constructor. Change the constructor arguments to match the new parameters.

fun build(screenSize: Size, aspectRatio: Rational, rotation: Int, viewFinder: TextureView): AutoFitPreviewAnalysis {
  val previewConfig = createPreviewConfig(screenSize, aspectRatio, rotation)
  val analysisConfig = createAnalysisConfig(screenSize, aspectRatio, rotation)
  return AutoFitPreviewAnalysis(previewConfig, analysisConfig, WeakReference(viewFinder))
}

In the init method, create the ImageAnalysis instance, and add accessors for both use-cases so you can bind them to the lifecycle.
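The article leaves that code to you. A sketch of what it might look like (the use-case property names match how they’re used from CameraFragment; the elided init logic carries over from the copied AutoFitPreviewBuilder):

```kotlin
// Sketch: the Preview set-up details (surface texture listener, updateTransform
// calls) come from the copied AutoFitPreviewBuilder and are elided here.
val previewUseCase: Preview
val analysisUseCase: ImageAnalysis

init {
  previewUseCase = Preview(previewConfig)
  analysisUseCase = ImageAnalysis(analysisConfig)
  // ... remaining AutoFitPreviewBuilder init logic, adapted for this class ...
}
```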

Finally, in CameraFragment, build an instance of AutoFitPreviewAnalysis and bind the created use-cases to the fragment’s lifecycle:

private fun setUpCameraX() {
  ...
  val autoFitPreviewAnalysis = AutoFitPreviewAnalysis.build(screenSize, aspectRatio, rotation, cameraView)
  
  CameraX.bindToLifecycle(this, autoFitPreviewAnalysis.previewUseCase, autoFitPreviewAnalysis.analysisUseCase)
}

At this point, you should be able to run your app and see the preview coming from your selfie camera. Next, we’ll add the image analysis!

Add Firebase ML Kit

We first need to add the Firebase ML Kit dependencies to the project, and set up our project with Firebase.

Add the following dependencies to your app/build.gradle file:

implementation 'com.google.firebase:firebase-core:16.0.9'
implementation 'com.google.firebase:firebase-ml-vision:20.0.0'
implementation 'com.google.firebase:firebase-ml-vision-face-model:17.0.2'

Create a new project in the Firebase console, and follow the directions to register your Android app with the Firebase service.
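If you haven’t used Firebase in this project before, registering the app also means wiring up the google-services Gradle plugin. Following the console’s directions, you’ll end up with something along these lines (the plugin version shown here is an assumption; use whatever the console suggests):

```groovy
// project-level build.gradle
buildscript {
  dependencies {
    classpath 'com.google.gms:google-services:4.2.0'
  }
}

// at the bottom of app/build.gradle
apply plugin: 'com.google.gms.google-services'
```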

Create the analyzer

To actually run the analysis, we’ll need to define a class that implements the ImageAnalysis.Analyzer interface. Our class, FaceAnalyzer, will encapsulate the ML Kit logic and pass the results to the view to be rendered. We’ll start with a bare-bones implementation, then optimize a bit from there: 

private class FaceAnalyzer : ImageAnalysis.Analyzer {
  private val faceDetector: FirebaseVisionFaceDetector by lazy {
    val options = FirebaseVisionFaceDetectorOptions.Builder()
      .setContourMode(FirebaseVisionFaceDetectorOptions.ALL_CONTOURS)
      .build()
    FirebaseVision.getInstance().getVisionFaceDetector(options)
  }
  private val successListener = OnSuccessListener<List<FirebaseVisionFace>> { faces ->
    Log.d("FaceAnalyzer", "Analyzer detected faces with size: ${faces.size}")
  }
  private val failureListener = OnFailureListener { e ->
    Log.e("FaceAnalyzer", "Face analysis failure.", e)
  }
  override fun analyze(image: ImageProxy?, rotationDegrees: Int) {
    if (image == null) return
    val cameraImage = image.image ?: return
    val firebaseVisionImage = FirebaseVisionImage.fromMediaImage(cameraImage, getRotationConstant(rotationDegrees))
    faceDetector.detectInImage(firebaseVisionImage)
      .addOnSuccessListener(successListener)
      .addOnFailureListener(failureListener)
  }
  private fun getRotationConstant(rotationDegrees: Int): Int {
    return when (rotationDegrees) {
      90 -> FirebaseVisionImageMetadata.ROTATION_90
      180 -> FirebaseVisionImageMetadata.ROTATION_180
      270 -> FirebaseVisionImageMetadata.ROTATION_270
      else -> FirebaseVisionImageMetadata.ROTATION_0
    }
  }
}

We first define the setup logic for our face detector. We only care about getting the contours, so we use the ALL_CONTOURS option and get the detector using the static FirebaseVision method.

We then define a success and failure listener for our detector. Right now, these will just log messages about the result, but we’ll add more to these later.

The key method is analyze(), overridden from the ImageAnalysis.Analyzer interface. This method will be called by CameraX’s analysis use-case with every frame detected by the camera. This frame is wrapped in an ImageProxy, so we ensure that we have data, then use the resulting image with FirebaseVisionImage.fromMediaImage() to construct an image object suitable to be analyzed with Firebase.

The other parameter that CameraX gives us in the analysis pathway is the degrees of rotation of the analysis image. This is useful for optimized analysis: computer vision algorithms are typically more accurate when the items being analyzed are in the expected orientation. Conveniently, fromMediaImage() takes a rotation parameter for exactly this purpose – we just need to transform it from degrees to FirebaseVisionImageMetadata constants using a small helper method, getRotationConstant().

Once we have the FirebaseVisionImage built, we can pass it to our faceDetector for analysis, along with the success and failure listeners.

To add our new FaceAnalyzer to our CameraX image analysis use-case, we just need to assign it to our ImageAnalysis object after we create it:

analysisUseCase = ImageAnalysis(analysisConfig).apply {
  analyzer = FaceAnalyzer()
}

Optimize

If you run the app now, it will show the preview and log that it found a face if you point it at yourself, but it’s really slow. That’s because we’re currently doing everything on the main thread, and we’re attempting to analyze most frames, even if the current analysis isn’t finished yet. Let’s fix that.

First, we’ll create a new thread to handle the analysis use-case, and apply it to the ImageAnalysisConfig when we create it:

return ImageAnalysisConfig.Builder().apply {
  ...
  val analysisThread = HandlerThread("FaceDetectionThread").apply { start() }
  setCallbackHandler(Handler(analysisThread.looper))
}.build()

Setting the callback handler in this manner instructs ImageAnalysis to invoke the ImageAnalysis.Analyzer.analyze() method on this thread instead of the main thread, which will greatly help with our UI jank.

In addition to this, we can limit the number of frames that the analysis gets run on. Our original setting of ImageAnalysis.ImageReaderMode.ACQUIRE_LATEST_IMAGE will help with this a little bit. However, because faceDetector.detectInImage() operates asynchronously and a new analysis pass will begin almost as soon as the previous invocation of analyze() returns, it’s likely that many analysis passes will be rapidly stacked and eventually cause an OutOfMemoryError.

To prevent this, we’ll add a simple atomic flag in FaceAnalyzer that causes the analyze() method to return early if a previous analysis is still in progress, causing that frame to be skipped:

private val isAnalyzing = AtomicBoolean(false)
private val successListener = OnSuccessListener<List<FirebaseVisionFace>> { faces ->
  isAnalyzing.set(false)
  Log.d("FaceAnalyzer", "Analyzer detected faces with size: ${faces.size}")
}
private val failureListener = OnFailureListener { e ->
  isAnalyzing.set(false)
  Log.e("FaceAnalyzer", "Face analysis failure.", e)
}
override fun analyze(image: ImageProxy?, rotationDegrees: Int) {    
  if (isAnalyzing.get()) return
  isAnalyzing.set(true)
  ...
}

Add an overlay to the preview

To draw the contour points on top of the preview View, we need to add another view. Since we need this new view to draw a series of arbitrary points, a custom view is going to be our best bet here: 

class FacePointsView @JvmOverloads constructor(
  context: Context,
  attrs: AttributeSet? = null,
  defStyleAttr: Int = 0
) : View(context, attrs, defStyleAttr) {
  private val pointPaint = Paint(Paint.ANTI_ALIAS_FLAG).apply {
    color = Color.BLUE
    style = Paint.Style.FILL
  }
  var points = listOf<PointF>()
    set(value) {
      field = value
      invalidate()
    }
  override fun onDraw(canvas: Canvas) {
    super.onDraw(canvas)
    canvas.apply {
      for (point in points) {
        drawCircle(point.x, point.y, 8f, pointPaint)
      }
    }
  }
}

Our new FacePointsView will simply subclass View, and instantiate a Paint with which to draw the points. In the onDraw(), all we need to do is iterate over the points list and draw each point as a small circle. When the points variable is set to something new, the view is invalidated so the points get re-drawn.

Add this new view in the fragment’s layout in the hierarchy so it’s drawn on top of the camera preview view:

<TextureView
    android:id="@+id/camera_view"
    android:layout_width="match_parent"
    android:layout_height="match_parent" />

<com.bignerdranch.cameraxmlkitblog.FacePointsView
  android:id="@+id/face_points_view"
  android:layout_width="match_parent"
  android:layout_height="match_parent" />

Draw the face points

Now that we have a view that will draw our points, we need to get those points to the view. In the FaceAnalyzer, add a lambda property that will serve as a listener, and grab the points from the successful analysis and pass them to the lambda: 

var pointsListListener: ((List<PointF>) -> Unit)? = null
private val successListener = OnSuccessListener<List<FirebaseVisionFace>> { faces ->
  isAnalyzing.set(false)
  val points = mutableListOf<PointF>()
  for (face in faces) {
    val contours = face.getContour(FirebaseVisionFaceContour.ALL_POINTS)
    points += contours.points.map { PointF(it.x, it.y) }
  }
  pointsListListener?.invoke(points)
}

To get the points to the view, the AutoFitPreviewAnalysis will need a reference to the FacePointsView. Add a parameter to the build() function, then create a WeakReference to it and pass it to the constructor:

fun build(
  screenSize: Size, 
  aspectRatio: Rational, 
  rotation: Int, 
  viewFinder: TextureView, 
  overlay: FacePointsView
): AutoFitPreviewAnalysis {
  val previewConfig = createPreviewConfig(screenSize, aspectRatio, rotation)
  val analysisConfig = createAnalysisConfig(screenSize, aspectRatio, rotation)
  return AutoFitPreviewAnalysis(
    previewConfig, 
    analysisConfig, 
    WeakReference(viewFinder), 
    WeakReference(overlay)
  )
}

Update the constructor to match the above signature, then grab a reference to the FacePointsView in CameraFragment and pass it to AutoFitPreviewAnalysis.build().
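Concretely, the CameraFragment side might look like this (the overlayView property name is an assumption; the view ID comes from the layout above):

```kotlin
// in onCreateView(), alongside the camera_view lookup
overlayView = view.findViewById(R.id.face_points_view)

// in setUpCameraX(), pass the overlay through to build()
val autoFitPreviewAnalysis =
    AutoFitPreviewAnalysis.build(screenSize, aspectRatio, rotation, cameraView, overlayView)
```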

Finally, when associating the FaceAnalyzer with the ImageAnalysis, set the analyzer’s pointsListListener to a lambda passing the points to the view:

analysisUseCase = ImageAnalysis(analysisConfig).apply {
  analyzer = FaceAnalyzer().apply {
    pointsListListener = { points ->
      overlayRef.get()?.points = points
    }
  }
}

Now if you run the app and point the camera at your face, you should see the points of a face drawn on the screen. It won’t be matched to your face yet, but you can move your mouth and watch the other face mirror your expression. Cool… or creepy?

Match the points to the preview

Since AutoFitPreviewAnalysis.updateTransform() takes the SurfaceTexture returned from the Preview use-case and transforms it to fit on the screen, the face contour points don’t match the preview image. We need to add a similar transform to the face points so the points will match.

The way the preview transform works is via matrix multiplication, which is a computationally efficient way of changing a large number of points at once. For our contour points, we’ll need to apply three transformations: scale, translation, and mirror. What we’ll do is construct a single matrix that represents the combination of all three transformations.

Initial setup

First, let’s add some additional cached values to work with. You’ll notice that at the top of AutoFitPreviewAnalysis, values for various dimensions and sizes are being cached for use in the transform calculation. To these values, add two more dimension caches:

/** Internal variable used to keep track of the image analysis dimension */
private var cachedAnalysisDimens = Size(0, 0)
/** Internal variable used to keep track of the calculated dimension of the preview image */
private var cachedTargetDimens = Size(0, 0)

cachedAnalysisDimens represents the size of the analysis image that CameraX’s analysis use case returns to FaceAnalyzer.analyze(), so we can add another callback lambda to send this value back to AutoFitPreviewAnalysis:

private class FaceAnalyzer : ImageAnalysis.Analyzer {
  var analysisSizeListener: ((Size) -> Unit)? = null
  override fun analyze(image: ImageProxy?, rotationDegrees: Int) {
    if (image == null) return
    val cameraImage = image.image ?: return
    analysisSizeListener?.invoke(Size(image.width, image.height))
    ...
  }
}

Then we can cache this value by setting a listener when we construct the FaceAnalyzer:

analyzer = FaceAnalyzer().apply {
  ...
  analysisSizeListener = {
    updateOverlayTransform(overlayRef.get(), it)
  }
}

The second new cached value, cachedTargetDimens, is the calculated size of the preview image. This is different from the viewFinderDimens value, which measures the size of the view itself. For example, when the preview is letterboxed, viewFinderDimens.width includes the width of the bars on the left and right of the image, whereas cachedTargetDimens.width is only the width of the image itself.

AutoFitPreviewAnalysis.updateTransform() already calculates this value as scaledWidth and scaledHeight to transform the preview, so all we need to do is store it for use with the overlay’s transform computation: 

// save the scaled dimens for use with the overlay
cachedTargetDimens = Size(scaledWidth, scaledHeight)

After we have these values set up, create a new method that we’ll use to build our transformation matrix for the overlay points:

private fun overlayMatrix(): Matrix {
  val matrix = Matrix()
  return matrix
}

Scale

The first transform we’ll need to apply is scaling – we want the points from the image analysis to occupy the same space as the preview image.

Since we set CameraX to use the same aspect ratio for both the preview and image analysis use-cases, we can use the same scale factor for both width and height. We also know the dimensions that we’d like to match: cachedTargetDimens, and we know our starting dimensions: cachedAnalysisDimens, so it’s just a matter of calculating the percent scale:

val scale = cachedTargetDimens.height.toFloat() / cachedAnalysisDimens.width.toFloat()

Note that we’re comparing the target’s height to the analysis’s width. This is because of how the target dimensions are calculated in updateTransform(), as well as how the camera defines its image’s dimensions by default. The target dimensions are calculated to always match a phone in portrait, regardless of true orientation – the long side is the height. The camera’s dimensions are defined in the opposite way – the long side is always the width, regardless of true orientation. Since we want the long sides from the analysis to match the long side of the preview, we just switch them when doing the calculation for the scale.
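A quick numeric check with hypothetical dimensions (a 640×480 analysis frame and a 1080×1440 preview target) shows why the long sides pair up:

```kotlin
fun main() {
    // Camera convention: the long side of the analysis frame is its width.
    val analysisDimens = Pair(640, 480)   // (width, height)
    // Target convention: the long side of the preview target is its height.
    val targetDimens = Pair(1080, 1440)   // (width, height)

    // Long side maps to long side: target height over analysis width.
    val scale = targetDimens.second.toFloat() / analysisDimens.first.toFloat()
    println(scale)  // 2.25

    // Pairing the short sides gives the same factor, as it must when the
    // aspect ratios match: 1080 / 480 = 2.25.
    val checkScale = targetDimens.first.toFloat() / analysisDimens.second.toFloat()
    println(checkScale)  // 2.25
}
```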

Now that we have the scale factor, use it to alter our identity matrix.

matrix.preScale(scale, scale)

Translation

Since it’s possible for the preview image to get letterboxed on the sides, depending on the phone size and orientation, we need to build a way to move our set of contour points to have the same origin as the preview image.

To do so, we need to calculate the difference between the view’s width and the target width (the width of the actual preview image displayed). Note, however, that viewFinderDimens are not independent of rotation, like cachedTargetDimens are. Therefore, we need to determine which orientation the phone is currently in, and find the difference between the corresponding sides in that orientation:

val xTranslate: Float
val yTranslate: Float
if (viewFinderDimens.width > viewFinderDimens.height) {
  // landscape: the viewFinder's long side (width) corresponds to the target's long side (height)
  xTranslate = (viewFinderDimens.width - cachedTargetDimens.height) / 2f
  yTranslate = (viewFinderDimens.height - cachedTargetDimens.width) / 2f
} else {
  // portrait: width corresponds to width, and height to height
  xTranslate = (viewFinderDimens.width - cachedTargetDimens.width) / 2f
  yTranslate = (viewFinderDimens.height - cachedTargetDimens.height) / 2f
}

Once we’ve calculated the distance in each axis to translate the points by, apply it to the matrix.

matrix.postTranslate(xTranslate, yTranslate)

Mirror

Since we’re using the front camera, the CameraX preview flips the image so that what appears on screen looks like what you’d see in a mirror. Image analysis doesn’t do this flip for us, so we have to mirror the points ourselves.

Fortunately, the mirror is the easy one: it’s just a scale transform of -1 in the x-direction. First, calculate the center of the image, then scale around that point:

val centerX = viewFinderDimens.width / 2f
val centerY = viewFinderDimens.height / 2f
matrix.postScale(-1f, 1f, centerX, centerY)
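Assembled, the completed overlayMatrix() is just the three steps above in sequence:

```kotlin
private fun overlayMatrix(): Matrix {
  val matrix = Matrix()

  // Scale: map analysis coordinates onto the preview image's size
  val scale = cachedTargetDimens.height.toFloat() / cachedAnalysisDimens.width.toFloat()
  matrix.preScale(scale, scale)

  // Translate: shift the points past any letterboxing to the preview's origin
  val xTranslate: Float
  val yTranslate: Float
  if (viewFinderDimens.width > viewFinderDimens.height) {
    xTranslate = (viewFinderDimens.width - cachedTargetDimens.height) / 2f
    yTranslate = (viewFinderDimens.height - cachedTargetDimens.width) / 2f
  } else {
    xTranslate = (viewFinderDimens.width - cachedTargetDimens.width) / 2f
    yTranslate = (viewFinderDimens.height - cachedTargetDimens.height) / 2f
  }
  matrix.postTranslate(xTranslate, yTranslate)

  // Mirror: flip horizontally around the center to match the selfie preview
  matrix.postScale(-1f, 1f, viewFinderDimens.width / 2f, viewFinderDimens.height / 2f)

  return matrix
}
```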

Use the matrix

Now that our overlayMatrix() function returns a matrix that encapsulates all the transformations that we need for our face map, let’s apply it to the points in the map. Add another member variable to the FacePointsView class to store the updated matrix:

var transform = Matrix()

Now we’ll add a method to apply this transform matrix to the list of points. The key method we’ll be building this around is Matrix.mapPoints(dst: FloatArray, src: FloatArray). For every pair of floats in the input array (src), representing an (x, y) point, the matrix is multiplied with that point, producing a new pair that is mapped to its position in the transformed space. These mapped points are copied to the output array (dst) in the same order.

For the code, add a private method that creates FloatArrays for the input and output, and passes them to mapPoints(). Then convert the output FloatArray back into a List<PointF> that we can use with our existing logic in onDraw():

private fun transformPoints() {
  // build src and dst
  val transformInput = points.flatMap { listOf(it.x, it.y) }.toFloatArray()
  val transformOutput = FloatArray(transformInput.size)
  // apply the matrix transformation
  transform.mapPoints(transformOutput, transformInput)
  // convert the transformed FloatArray back into a List<PointF>
  drawingPoints = transformOutput.asList()
      .chunked(size = 2, transform = { (x, y) -> PointF(x, y) })
}

Note that drawingPoints hasn’t been defined – that’s because we’ll need that to be a member variable so it’s available to our onDraw(). Let’s add that now.

private var drawingPoints = listOf<PointF>()

Now our FacePointsView has everything it needs to draw the points in the correct position over the camera preview image.
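One loose end: the updateOverlayTransform() function we registered as the analysisSizeListener earlier was never spelled out. A minimal sketch (the body here is an assumption) caches the incoming analysis size and hands a fresh matrix to the view:

```kotlin
private fun updateOverlayTransform(overlay: FacePointsView?, analysisDimens: Size) {
  if (analysisDimens.width == 0 || analysisDimens.height == 0) return
  // Only rebuild and push the matrix when the analysis size actually changes
  if (analysisDimens != cachedAnalysisDimens) {
    cachedAnalysisDimens = analysisDimens
    overlay?.transform = overlayMatrix()
  }
}
```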

Draw the face points… again, but better!

Currently, we’re calling invalidate() on the FacePointsView whenever we get a new set of points from the analyzer. Now that we’ve added our transformation matrix, we’d like those points to first be transformed before any drawing occurs. We’ll also need the previously stored points to be transformed again if the matrix changes. Let’s change the setters of both points and transform to achieve this:

var points = listOf<PointF>()
  set(value) {
    field = value
    transformPoints()
  }
var transform = Matrix()
  set(value) {
    field = value
    transformPoints()
  }

Whenever either of these variables is changed, we call transformPoints(), which uses the current values of points and transform to create a new drawingPoints list.

We then need to change our onDraw() method to draw from the points in drawingPoints, instead of from points:

canvas.apply {
-   for (point in points) {
+   for (point in drawingPoints) {
      drawCircle(point.x, point.y, 8f, pointPaint)
    }
  }

Finally, the only thing that remains is to tell our view to redraw every time the drawingPoints list gets updated. Make a custom setter and call invalidate() in it to achieve this:

private var drawingPoints = listOf<PointF>()
  set(value) {
    field = value
    invalidate()
  }

Now if you run the app, you should see the points overlaid on the image of your face. When you move your face, the points should follow!

Sample project

See the sample project for the full code.
