[Android code] Reading a picture book aloud on a phone/pad automatically, using AI recognition when a page is turned

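Core features:

  • Open the camera (an external camera can also be supported)
  • Detect page turns (an image-difference algorithm may be added later)
  • Take a photo, then run image recognition on it
  • Read the recognized text aloud automatically with TextToSpeech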
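The image-difference idea mentioned for page-turn detection could be sketched roughly as follows. This is not part of the original code: `FrameDiff`, `meanAbsDiff`, `looksLikePageTurn`, the sampling step of 4 and any threshold value are all hypothetical and would need tuning on real frames. The sketch assumes each frame is available as an 8-bit luminance buffer of the same size.

```java
// Hypothetical helper, not from the original project: compares two luminance frames.
final class FrameDiff {
    private FrameDiff() {}

    /**
     * Mean absolute difference between two equally sized 8-bit luminance buffers,
     * looking only at every step-th pixel. The result lies between 0 and 255.
     */
    static double meanAbsDiff(byte[] previous, byte[] current, int step) {
        if (previous.length != current.length) {
            throw new IllegalArgumentException("frames must have the same size");
        }
        if (step < 1) {
            throw new IllegalArgumentException("step must be >= 1");
        }
        long sum = 0;
        int samples = 0;
        for (int i = 0; i < current.length; i += step) {
            // Java bytes are signed, so mask them to get pixel values in 0..255.
            sum += Math.abs((previous[i] & 0xFF) - (current[i] & 0xFF));
            samples++;
        }
        return samples == 0 ? 0.0 : (double) sum / samples;
    }

    /** Assumed rule: a large average change between frames suggests a new page. */
    static boolean looksLikePageTurn(byte[] previous, byte[] current, double threshold) {
        return meanAbsDiff(previous, current, 4) > threshold;
    }
}
```

In the app, the luminance data would probably come from the first plane of the analyzed `ImageProxy` (`image.getPlanes()[0].getBuffer()`), copied into a byte array before the frame is closed. Row padding in that buffer is ignored here for simplicity.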

📌 Notes:

  • CameraX (Android Jetpack) handles the camera input

  • ML Kit performs the text recognition

  • TextToSpeech reads the text aloud
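Text recognized from a picture-book page usually keeps the line breaks of the page layout, which can make the speech engine pause in odd places. A small cleanup step between recognition and TextToSpeech may help. The sketch below is only an illustration; `SpeechText` and `forSpeech` are hypothetical names that do not appear in the original code.

```java
// Hypothetical helper, not from the original project: tidies OCR output before speaking it.
final class SpeechText {
    private SpeechText() {}

    /**
     * Collapses every run of whitespace (spaces, tabs, line breaks) into a single
     * space and trims the ends. Returns an empty string for null or blank input.
     */
    static String forSpeech(String ocrText) {
        if (ocrText == null) {
            return "";
        }
        return ocrText.replaceAll("\\s+", " ").trim();
    }
}
```

The result could then be passed to `tts.speak(...)`, skipping the call when the string is empty.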

// Project: StoryBookReaderApp
// MainActivity.java: main class of the Android Studio project

package com.example.storybookreaderapp;

import android.Manifest;
import android.content.pm.PackageManager;
import android.os.Bundle;
import android.speech.tts.TextToSpeech;
import android.util.Log;
import android.view.SurfaceView;
import android.widget.Toast;

import androidx.annotation.NonNull;
import androidx.appcompat.app.AppCompatActivity;
import androidx.camera.core.CameraSelector;
import androidx.camera.core.ImageAnalysis;
import androidx.camera.core.ImageCapture;
import androidx.camera.core.ImageCaptureException;
import androidx.camera.core.ImageProxy;
import androidx.camera.lifecycle.ProcessCameraProvider;
import androidx.core.app.ActivityCompat;
import androidx.core.content.ContextCompat;

import com.google.common.util.concurrent.ListenableFuture;
import com.google.mlkit.vision.common.InputImage;
import com.google.mlkit.vision.text.TextRecognition;
import com.google.mlkit.vision.text.TextRecognizer;

import java.util.Locale;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MainActivity extends AppCompatActivity {
    private static final int REQUEST_CODE_PERMISSIONS = 10;
    private static final String[] REQUIRED_PERMISSIONS = new String[]{Manifest.permission.CAMERA};

    private ExecutorService cameraExecutor;
    private ImageCapture imageCapture;
    private TextToSpeech tts;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        SurfaceView view = new SurfaceView(this);
        setContentView(view);

        // Create the executor before the camera is started, so the analyzer never sees a null executor.
        cameraExecutor = Executors.newSingleThreadExecutor();

        if (allPermissionsGranted()) {
            startCamera();
        } else {
            ActivityCompat.requestPermissions(this, REQUIRED_PERMISSIONS, REQUEST_CODE_PERMISSIONS);
        }

        tts = new TextToSpeech(this, status -> {
            if (status != TextToSpeech.ERROR) {
                tts.setLanguage(Locale.US);
            }
        });
    }

    private boolean allPermissionsGranted() {
        for (String permission : REQUIRED_PERMISSIONS) {
            if (ContextCompat.checkSelfPermission(this, permission) != PackageManager.PERMISSION_GRANTED) {
                return false;
            }
        }
        return true;
    }

    private void startCamera() {
        ListenableFuture<ProcessCameraProvider> cameraProviderFuture = ProcessCameraProvider.getInstance(this);
        cameraProviderFuture.addListener(() -> {
            try {
                ProcessCameraProvider cameraProvider = cameraProviderFuture.get();

                imageCapture = new ImageCapture.Builder().build();

                ImageAnalysis imageAnalysis = new ImageAnalysis.Builder().build();
                imageAnalysis.setAnalyzer(cameraExecutor, image -> {
                    detectPageTurn(image);
                    image.close();
                });

                CameraSelector cameraSelector = CameraSelector.DEFAULT_BACK_CAMERA;

                cameraProvider.unbindAll();
                cameraProvider.bindToLifecycle(this, cameraSelector, imageCapture, imageAnalysis);

            } catch (Exception e) {
                Log.e("CameraX", "Binding failed", e);
            }
        }, ContextCompat.getMainExecutor(this));
    }

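    // True while a capture/recognition round is running, so frames do not pile up captures.
    private volatile boolean captureInFlight = false;

    // Placeholder: real page-turn detection (e.g. image differencing) is not implemented yet.
    // Until then, a frame triggers a capture only when no capture is running
    // and the previous page is no longer being read aloud.
    private void detectPageTurn(ImageProxy image) {
        if (captureInFlight || (tts != null && tts.isSpeaking())) {
            return;
        }
        captureInFlight = true;
        takePhotoAndRead();
    }

    private void takePhotoAndRead() {
        imageCapture.takePicture(ContextCompat.getMainExecutor(this), new ImageCapture.OnImageCapturedCallback() {
            @Override
            public void onCaptureSuccess(@NonNull ImageProxy image) {
                // getImage() is marked @ExperimentalGetImage in CameraX and may need an opt-in annotation.
                if (image.getImage() == null) {
                    image.close();
                    captureInFlight = false;
                    return;
                }
                InputImage inputImage = InputImage.fromMediaImage(image.getImage(), image.getImageInfo().getRotationDegrees());
                // Newer ML Kit releases expect an options argument here,
                // e.g. TextRecognizerOptions.DEFAULT_OPTIONS.
                TextRecognizer recognizer = TextRecognition.getClient();

                recognizer.process(inputImage)
                        .addOnSuccessListener(result -> {
                            String text = result.getText();
                            if (!text.trim().isEmpty()) {
                                tts.speak(text, TextToSpeech.QUEUE_FLUSH, null, null);
                            }
                        })
                        .addOnFailureListener(e -> Toast.makeText(MainActivity.this, "Text recognition failed", Toast.LENGTH_SHORT).show())
                        // Close the frame only after ML Kit has finished with it.
                        .addOnCompleteListener(task -> {
                            image.close();
                            captureInFlight = false;
                        });
            }

            @Override
            public void onError(@NonNull ImageCaptureException exception) {
                Log.e("CameraX", "Capture failed", exception);
                captureInFlight = false;
            }
        });
    }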

    @Override
    protected void onDestroy() {
        if (tts != null) {
            tts.stop();
            tts.shutdown();
        }
        cameraExecutor.shutdown();
        super.onDestroy();
    }

    @Override
    public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults);
        if (requestCode == REQUEST_CODE_PERMISSIONS) {
            if (allPermissionsGranted()) {
                startCamera();
            } else {
                Toast.makeText(this, "Permissions not granted", Toast.LENGTH_SHORT).show();
                finish();
            }
        }
    }
} // End of class
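The `detectPageTurn` method in the listing is still a stub. One possible way to fill it in, assuming some per-frame score of how much the image changed is available, is to wait until the page has moved and then settled again before capturing, so the photo is not taken mid-turn. The class below is a sketch under that assumption; `PageTurnDetector`, `onFrameDiff` and both parameters are hypothetical and not part of the original project.

```java
// Hypothetical helper, not from the original project:
// fires once when the scene has changed a lot and then stayed still for a while.
final class PageTurnDetector {
    private final double motionThreshold;
    private final int stableFramesNeeded;
    private boolean motionSeen = false;
    private int stableFrames = 0;

    PageTurnDetector(double motionThreshold, int stableFramesNeeded) {
        if (stableFramesNeeded < 1) {
            throw new IllegalArgumentException("stableFramesNeeded must be >= 1");
        }
        this.motionThreshold = motionThreshold;
        this.stableFramesNeeded = stableFramesNeeded;
    }

    /**
     * Feed the change score of each analyzed frame.
     * Returns true exactly once per turn: after motion was seen and the score
     * then stayed at or below the threshold for stableFramesNeeded frames in a row.
     */
    boolean onFrameDiff(double diff) {
        if (diff > motionThreshold) {
            // The page (or a hand) is moving: remember it and restart the still-frame count.
            motionSeen = true;
            stableFrames = 0;
            return false;
        }
        if (!motionSeen) {
            return false; // Nothing has moved since the last trigger.
        }
        stableFrames++;
        if (stableFrames >= stableFramesNeeded) {
            motionSeen = false;
            stableFrames = 0;
            return true;
        }
        return false;
    }
}
```

Inside the analyzer, a `true` result would be the point at which to call `takePhotoAndRead()`. Both the threshold and the number of still frames are guesses that depend on frame rate and lighting.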