Building ML-powered cross-platform apps with Flutter and Google's ML Kit!

This article will help you take advantage of Google's ML Kit to integrate ML into Flutter applications!

Introduction

Hey guys, it's been a while!

I've been looking into building some ML-powered cross-platform applications using Flutter. There are tons of tutorials online using TensorFlow Lite (using the tflite package).

That's all great, except for the fact that the package seems to be abandoned. The repository is filled with issues asking the author to update it.

The good news is that we have alternatives to this package such as the tflite_flutter package which was recently confirmed to be officially taken over by the TensorFlow team.

In this article, however, we shall take advantage of Google's ML Kit and use google_mlkit_object_detection to develop an application that can detect objects.

What is this?

ML Kit is Google's on-device machine learning SDK, and the google_ml_kit family of Flutter plugins essentially lets you integrate its ML-related APIs into your Flutter application.

You can perform a bunch of operations such as face detection, pose detection, object detection, language identification, and many more!

Some ML Kit APIs even allow you to use custom TensorFlow Lite models!

You can find out more about this package on their GitHub.

What are we building?

Great, so now that we know what Google ML Kit is used for, let's understand what we're going to build.

This is going to be a simple example to demonstrate Object Detection. However, this app will give you a great idea of how to implement this yourself and possibly extend it to create interesting applications, powered by ML!

As you can see in the video above, we will try to display the label of the detected object as well as its confidence score (which basically gives us an idea of how sure the model is about that detection).

In this example application, I have used a custom TFLite model - but I will also give you a quick overview of how you can use the base model as well!

Diving into the code!

Alright, time for the fun part! 😉

Before I begin explaining the code, you can find the entire source code at my GitHub.

Core functions

Alright, let's go over the critical functions of the application. This is probably all you need to refer to in order to get started with this package.

I will also explain how to use CustomPainter to paint rectangles over the detected objects and of course, go through the UI as well.
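Before diving in, here's a rough sketch of the globals and state fields the snippets below refer to. The names match the code in this article, but exactly where you declare them (top level vs. inside your State class) is up to you:

import 'package:flutter/material.dart';
import 'package:camera/camera.dart';
import 'package:google_mlkit_object_detection/google_mlkit_object_detection.dart';

// Rough sketch of the variables used throughout this article.
late List<CameraDescription> cameras;   // filled in main()
late CameraController controller;       // initialized in initCamera()
late ObjectDetector objectDetector;     // initialized in initModel()

CameraImage? img;                       // latest frame from the camera stream
bool isBusy = false;                    // true while a frame is being processed
List<DetectedObject>? _detectedObjects; // results of the last detection
String labelString = '';                // label of the last detected object
String confidenceString = '';           // confidence of the last detected object
List<Widget> stackChildren = [];        // children of the Stack built in build()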

main()


void main() async {
  WidgetsFlutterBinding.ensureInitialized();
  cameras = await availableCameras();

  runApp(const MyApp());
}

This is the main function of the Flutter app.

We call WidgetsFlutterBinding.ensureInitialized() to make sure the widget binding is initialized before anything else is executed - this is required here because we're calling a plugin (availableCameras()) before runApp().

cameras is a global List that stores the cameras available on the device, which we fetch using the availableCameras() function of the camera package.

_getModel()

  Future<String> _getModel(String assetPath) async {
    if (Platform.isAndroid) {
      return 'flutter_assets/$assetPath';
    }
    final path = '${(await getApplicationSupportDirectory()).path}/$assetPath';
    await Directory(dirname(path)).create(recursive: true);
    final file = File(path);
    if (!await file.exists()) {
      final byteData = await rootBundle.load(assetPath);
      await file.writeAsBytes(byteData.buffer
          .asUint8List(byteData.offsetInBytes, byteData.lengthInBytes));
    }
    return file.path;
  }

This function is used to fetch the path to the custom TFLite model that we want to use in our application.

You do not need this if you're intending to work with the base model itself.

To get started, you need to store the model in a folder inside your project (this example uses assets/ml/), and don't forget to declare it in your pubspec.yaml.
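For reference, the asset entry in pubspec.yaml could look something like this (the assets/ml/model.tflite path matches the one used in initModel() later on):

# pubspec.yaml - make sure the model file is bundled as an asset
flutter:
  assets:
    - assets/ml/model.tflite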

The function first checks whether the platform is Android and, if so, returns the asset path prefixed with flutter_assets/. This is because, on Android, Flutter assets are bundled in the flutter_assets directory.

If the platform is not Android, we use getApplicationSupportDirectory() from the path_provider package to get the application support directory. The asset path is then appended to it to build the full path to the model file.

Next, the function creates a directory for the model file using the Directory class, and dirname() from the path package is used to extract the directory portion of the full path. The recursive parameter is set to true so that any missing directories in the path are created.

After creating the directory, the function creates a File object using the full path to the model file. If the file does not exist yet, the function loads the model as byte data using the rootBundle.load() method, and the byte data is then written to the file using writeAsBytes().

initModel()

  initModel() async {
    final modelPath = await _getModel('assets/ml/model.tflite');

    final options = LocalObjectDetectorOptions(
        modelPath: modelPath,
        classifyObjects: true,
        multipleObjects: true,
        mode: DetectionMode.stream,
        confidenceThreshold: 0.5);
    objectDetector = ObjectDetector(options: options);
  }

This function is used to initialize the objectDetector object.

To initialize it, we need to pass an options parameter to the ObjectDetector constructor.

Here is where the key difference between using a custom model and base model comes in.

Custom TFLite Model

If you're intending to use a custom model, then, you need to find the path to the model using _getModel().

Then, we use the LocalObjectDetectorOptions class, which takes the path to the model file, whether to classify objects, whether to detect multiple objects, the detection mode, and the confidence threshold.

Base Model

If you're intending to use the base model, then you do not have to find the path to any custom model.

Instead of the LocalObjectDetectorOptions class, we use the ObjectDetectorOptions class. The only difference is that ObjectDetectorOptions does not take the modelPath and confidenceThreshold parameters.
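So, as a rough sketch, initializing the detector with the base model could look like this (same initModel() structure as above, just without the custom model path):

  initModel() async {
    // Base model: no custom .tflite file and no confidence threshold needed.
    final options = ObjectDetectorOptions(
        classifyObjects: true,
        multipleObjects: true,
        mode: DetectionMode.stream);
    objectDetector = ObjectDetector(options: options);
  }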

initCamera()


  initCamera() async {
    controller = CameraController(cameras[0], ResolutionPreset.high);
    await controller.initialize().then((_) async {
      await startStream();
      if (!mounted) {
        return;
      }
    }).catchError((Object e) {
      if (e is CameraException) {
        switch (e.code) {
          case 'CameraAccessDenied':
            print('Camera access denied!');
            break;
          default:
            print('Camera initialization error!');
            break;
        }
      }
    });
  }

This function is used to initialize the camera controller from the camera package.

We use the first camera found (stored in the global cameras variable) and set the camera's resolution preset to high.

When the camera controller has been initialized, we start the camera stream through startStream() function. This stream is what lets us have a "live preview" of the camera in our application.

startStream()

  startStream() async {
    await controller.startImageStream((image) async {
      if (!isBusy) {
        isBusy = true;
        img = image;
        await performDetectionOnFrame();
      }
    });
  }

This function basically starts the image stream through the startImageStream() function of the camera controller.

It returns a stream of images, and when we receive an image (and we aren't already busy processing a previous one), we set the isBusy variable to true and process the image to detect objects through performDetectionOnFrame().

performDetectionOnFrame()

  performDetectionOnFrame() async {
    InputImage frameImg = getInputImage();
    List<DetectedObject> objects = await objectDetector.processImage(frameImg);
    setState(() {
      _detectedObjects = objects;
    });
    isBusy = false;
  }

ML Kit requires images in a particular format in order to process them. This format is defined by the InputImage class.
The image sent from the camera controller isn't in this format, so we have to convert it using getInputImage().

Once we have converted it, we can perform object detection by calling the processImage() function of the objectDetector object, passing the converted image as a parameter. This gives us a list of detected objects (since a single frame can contain more than one object).
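Just to give you an idea of what each DetectedObject contains, here's a small illustrative snippet (assuming the objects list returned by processImage() above) that prints every label and its confidence:

// Each DetectedObject carries a bounding box and a list of labels.
for (final DetectedObject obj in objects) {
  final Rect box = obj.boundingBox; // where the object is in the frame
  for (final Label label in obj.labels) {
    print('${label.text} at $box: '
        '${(label.confidence * 100).toStringAsFixed(1)}% confident');
  }
}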

Once we get this list, we use setState() to store it in the class variable _detectedObjects. We will see later on why it needs to be stored in a class variable.

Once we're done processing the image, we set isBusy back to false so we can process the next image returned by the stream.

getInputImage()

  InputImage getInputImage() {
    final WriteBuffer allBytes = WriteBuffer();
    for (final Plane plane in img!.planes) {
      allBytes.putUint8List(plane.bytes);
    }
    final bytes = allBytes.done().buffer.asUint8List();
    final Size imageSize = Size(img!.width.toDouble(), img!.height.toDouble());
    final camera = cameras[0];

    final planeData = img!.planes.map(
      (Plane plane) {
        return InputImagePlaneMetadata(
          bytesPerRow: plane.bytesPerRow,
          height: plane.height,
          width: plane.width,
        );
      },
    ).toList();

    final inputImageData = InputImageData(
      size: imageSize,
      imageRotation:
          InputImageRotationValue.fromRawValue(camera.sensorOrientation)!,
      inputImageFormat: InputImageFormatValue.fromRawValue(img!.format.raw)!,
      planeData: planeData,
    );

    final inputImage =
        InputImage.fromBytes(bytes: bytes, inputImageData: inputImageData);

    return inputImage;
  }

This function is what converts the image received from the camera controller into a format that can be processed by ML Kit.

It comes predefined in the ML Kit GitHub repository, since this conversion is necessary to take advantage of the package.

The function first creates a WriteBuffer object to store the bytes of the image. It then iterates over the planes of the image and adds the bytes to the WriteBuffer object. The WriteBuffer object is then converted to a Uint8List and stored in the bytes variable.

Next, the method creates a Size object to store the dimensions of the image. Then, it gets the first camera from the cameras list (which was initialized in the main() function) and creates a List of InputImagePlaneMetadata objects to store metadata about each plane of the image.

Finally, the method creates an InputImageData object using the Size object, the rotation value of the camera, the input image format, and the InputImagePlaneMetadata list. It then creates an InputImage object using the bytes and InputImageData objects and returns it.

Pfft! That was intense - you can probably skip over this part if you don't really want to understand how it's converted!

Drawing rectangles over detected objects

Now that we're done with the core functions, let's understand how we can draw a rectangle (or bounding box) around a detected object.

The secret ingredient here is CustomPainter.

When I first came across this, it seemed super intimidating, so let me try and break it down to make it simpler.

CustomPainter is a class in Flutter that essentially allows you to create custom graphics on a canvas. We can draw shapes, lines or even write text!

It's a SUPER cool class in Flutter and there are some great articles online that showcase its full potential to draw some mesmerizing graphics!

Anyway, for our application, we just need to draw a rectangle around the coordinates that we get from the DetectedObject objects stored in the class list _detectedObjects.

To use CustomPainter, we need to extend the class and override the paint() method.

This method is called whenever the widget needs to be repainted, and it provides a Canvas object that you can use to draw your custom graphics on.

You can also optionally override the shouldRepaint() method to control when the widget should be repainted.

class ObjectPainter extends CustomPainter {
  ObjectPainter(this.imgSize, this.objects);

  final Size imgSize;
  final List<DetectedObject> objects;

  @override
  void paint(Canvas canvas, Size size) {
    // Calculating the scale factor to resize the rectangle (newSize/originalSize)
    final double scaleX = size.width / imgSize.width;
    final double scaleY = size.height / imgSize.height;

    final Paint paint = Paint()
      ..style = PaintingStyle.stroke
      ..strokeWidth = 10.0
      ..color = Color.fromARGB(255, 255, 0, 0);

    for (DetectedObject detectedObject in objects) {
      canvas.drawRect(
        Rect.fromLTRB(
          detectedObject.boundingBox.left * scaleX,
          detectedObject.boundingBox.top * scaleY,
          detectedObject.boundingBox.right * scaleX,
          detectedObject.boundingBox.bottom * scaleY,
        ),
        paint,
      );

      var list = detectedObject.labels;
      for (Label label in list) {
        labelString = label.text;
        confidenceString = label.confidence.toStringAsFixed(2);
      }
    }
  }

  @override
  bool shouldRepaint(ObjectPainter oldDelegate) {
    // Repaint if object is moving or new objects detected
    return oldDelegate.imgSize != imgSize || oldDelegate.objects != objects;
  }
}

In our case, I am overriding both methods.

We have defined a class called ObjectPainter which extends CustomPainter class.

The constructor will take in two parameters - imgSize and objects, which represent the size of the image and a list of detected objects respectively.

The paint() method is overridden to draw the rectangles over the detected objects. The method first calculates the scale factor to resize the rectangle based on the size of the image and the size of the canvas. It then creates a Paint object to specify the style, width, and color of the rectangle.

The method then iterates over the list of DetectedObject objects and draws a rectangle for each object using the drawRect() method of the canvas object. The rectangle is defined using the bounding box of the detected object, which is scaled using the calculated scale factors. We can get the coordinates of the bounding box from the DetectedObject class.

The method also extracts the label and confidence information for each detected object and stores it in labelString and confidenceString respectively, which are global variables.

The shouldRepaint() method is also overridden to compare the current imgSize and objects with the old delegate's imgSize and objects and returns true if they are different, which indicates that the widget should be repainted.

User Interface and putting it all together!

We're almost there! Now, we just have to create the UI, call the functions properly and draw the rectangles over the objects!

initState()

  @override
  void initState() {
    super.initState();
    initModel();
    initCamera();
  }

This should be pretty self-explanatory. initState() is the first method called when the widget's State object is created - so we use it to initialize the model and our camera.

When the camera is initialized, it starts a stream, where each image of the stream is sent for processing and the detected objects are stored in _detectedObjects.
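The walkthrough doesn't show it, but since initState() sets all of this up, it's good practice to tear it down again in dispose(). A minimal sketch (treat this as my own addition, assuming the controller and objectDetector fields from earlier):

  @override
  void dispose() {
    // Stop the image stream and release the camera and the detector.
    controller.stopImageStream();
    controller.dispose();
    objectDetector.close();
    super.dispose();
  }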

We've defined a CustomPaint class to draw rectangles, but haven't called it yet.

So, let's do that!

drawRectangleOverObjects()

  Widget drawRectangleOverObjects() {
    if (_detectedObjects == null ||
        controller == null ||
        !controller.value.isInitialized) {
      return Container(
          child: Center(
        child: Column(
            mainAxisAlignment: MainAxisAlignment.center,
            children: const [
              Text('Loading...'),
              CircularProgressIndicator(),
            ]),
      ));
    }

    final Size imageSize = Size(
      controller.value.previewSize!.height,
      controller.value.previewSize!.width,
    );
    CustomPainter painter = ObjectPainter(imageSize, _detectedObjects);
    return CustomPaint(
      painter: painter,
    );
  }

This function will call our ObjectPainter class and return a CustomPaint object.

The method first checks a few conditions - whether the controller is null (not initialized) or whether no objects have been detected yet.

If any of these is the case, it displays a "Loading..." text along with a CircularProgressIndicator, laid out in a column.

If the camera is initialized and _detectedObjects is not null, the method takes the preview size of the camera controller and creates a Size object representing the image size (the width and height are swapped because the camera reports its preview size in landscape orientation, while the app runs in portrait).

A new ObjectPainter instance is created using the imageSize and _detectedObjects as parameters and a CustomPaint widget is returned with the ObjectPainter instance as its painter.

build()

  @override
  Widget build(BuildContext context) {
    Size size = MediaQuery.of(context).size;
    // build() runs after every setState(), so reset the stack children
    // before adding the preview and the painter again.
    stackChildren.clear();
    if (controller != null) {
      stackChildren.add(
        Positioned(
          top: 0.0,
          left: 0.0,
          width: size.width,
          height: size.height,
          child: Container(
            child: (controller.value.isInitialized)
                ? AspectRatio(
                    aspectRatio: controller.value.aspectRatio,
                    child: CameraPreview(controller),
                  )
                : Container(),
          ),
        ),
      );
      stackChildren.add(
        Positioned(
            top: 0.0,
            left: 0.0,
            width: size.width,
            height: size.height,
            child: drawRectangleOverObjects()),
      );
    }
    return Scaffold(
      appBar: AppBar(
        centerTitle: true,
        title: const Text("Object detector"),
        backgroundColor: Color.fromARGB(255, 126, 0, 252),
      ),
      backgroundColor: Colors.black,
      body: Column(
        children: [
          Expanded(
            child: Container(
                margin: const EdgeInsets.only(top: 0),
                color: Colors.black,
                child: Stack(
                  children: stackChildren,
                )),
          ),
          Container(
              color: Colors.white,
              height: MediaQuery.of(context).size.width * 0.30,
              width: MediaQuery.of(context).size.width,
              child: Column(
                mainAxisAlignment: MainAxisAlignment.center,
                children: [
                  Text('Name: $labelString',
                      style: const TextStyle(fontSize: 21)),
                  Text('Confidence: $confidenceString',
                      style: const TextStyle(fontSize: 21))
                ],
              )),
        ],
      ),
    );
  }

This method is essentially what creates the UI of the app.

The logic is simple, we're going to use a Stack to display the custom-drawn rectangles over the live image preview from the camera to show the detected objects.

It first checks that the controller is not null. If so, it adds a Positioned widget containing the camera preview to the stackChildren list defined in the class, and then adds another Positioned widget containing the detected-object rectangles by calling the drawRectangleOverObjects() method.

The Scaffold widget has a Column as its body, which contains two children: an Expanded Container holding the Stack with the camera preview and the detected-object rectangles, and a Container showing the label and confidence of the detected object - which we get from the global labelString and confidenceString variables.

The AppBar widget contains the app title with a purple background color, and we set the background color of the app body to black.

That's it!

Conclusion

I started my Flutter journey as a hobbyist and now find myself working with it almost every day!

Honestly, the Flutter community is so vast that you can pretty much do everything with Flutter - including building decentralized applications!

It was kind of a bummer that I couldn't find any good tutorials on this package, and I had some trouble figuring it out since it was also my introduction to working with ML.

I'm excited to see what interesting applications will come out powered by ML, and I genuinely hope this article will help y'all integrate ML into your applications!

Thanks for reading! ❤️
