    • 68. Granted invention patent
    • System and method for detecting deception in an audio-video response of a user
    • US11151385B2
    • 2021-10-19
    • US16722083
    • 2019-12-20
    • RTScaleAI Inc
    • Vivek Iyer; Peter Walker
    • G06K9/00; G10L15/02; G10L15/22; G10L15/18; G06K9/32; G06N5/04; G10L21/0232; G10L25/63; G10L25/90; G06K9/62; G06N20/00
    • A method of detecting deception in an Audio-Video response of a user, using a server in a distributed computing architecture, characterized in that the method includes: enabling an Audio-Video connection with a user device upon receiving a request from a user; obtaining, from the user device, an Audio-Video response of the user corresponding to a first set of questions that are provided to the user by the server; extracting audio signals and video signals from the Audio-Video response; detecting an activity of the user by determining a plurality of Natural Language Processing (NLP) features from the extracted audio signals by (i) performing a speech to text translation and (ii) extracting the plurality of NLP features from the translated text, and determining a plurality of speech features from the extracted audio signals by (i) splitting the extracted audio signals into a plurality of short interval audio signals and (ii) extracting the plurality of speech features from the plurality of short interval audio signals; aggregating (i) the plurality of NLP features to obtain a plurality of temporal NLP features and (ii) the plurality of speech features to obtain a plurality of temporal speech features; aggregating the plurality of temporal NLP features and the plurality of temporal speech features to obtain first temporal aggregated features; detecting a plurality of micro-expressions of the user by splitting the extracted video signals into a plurality of short fixed-duration video signals, detecting a plurality of Regions Of Interest (ROIs) in the plurality of short fixed-duration video signals, and comparing the plurality of detected ROIs with video signals annotated with micro-expression labels that are stored in a database to detect the plurality of micro-expressions of the user in the plurality of short fixed-duration video signals; tracking and determining a gesture of the user from the extracted video signals; aggregating the plurality of micro-expressions and the gesture of the user to obtain second temporal aggregated features; aggregating the first temporal aggregated features and the second temporal aggregated features to obtain final temporal aggregated features; and detecting, using a machine learning model, a deception in the Audio-Video response based on the final temporal aggregated features.
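
The windowing and aggregation steps in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the energy feature, the window size, and all sample values are hypothetical, chosen only to show how per-interval feature streams might be aligned by time step and concatenated into first, second, and final aggregated features. The speech-to-text, ROI detection, and machine learning model stages are left out.

```python
from statistics import mean


def split_into_windows(samples, window_size):
    """Split a sequence into consecutive windows (the last one may be shorter)."""
    return [samples[i:i + window_size] for i in range(0, len(samples), window_size)]


def window_energy(window):
    """Hypothetical speech feature: mean squared amplitude of one window."""
    return mean(x * x for x in window)


def temporal_aggregate(*feature_streams):
    """Align per-window feature vectors by time step and concatenate them."""
    steps = min(len(stream) for stream in feature_streams)
    return [
        [value for stream in feature_streams for value in stream[t]]
        for t in range(steps)
    ]


# Hypothetical inputs: 8 audio samples split into 2 short intervals.
audio = [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0]
speech_features = [[window_energy(w)] for w in split_into_windows(audio, 4)]

# Assumed per-interval NLP feature (e.g. a word count from the transcript).
nlp_features = [[3.0], [5.0]]

# Assumed per-interval video features: [micro-expression flag, gesture flag].
video_features = [[1.0, 0.0], [0.0, 1.0]]

first_aggregated = temporal_aggregate(nlp_features, speech_features)
final_aggregated = temporal_aggregate(first_aggregated, video_features)
print(final_aggregated)
```

In this sketch each row of `final_aggregated` is one time step, which is the shape a downstream sequence classifier would typically consume.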