Department of Computer Science and Engineering

Work by the faculty and students of the Department of Computer Science and Engineering

Recent Submissions

  • Item
    Machine Learning-based X-Ray Projection Interpolation for Improved 4D-CBCT Reconstruction
    (IEEE, 2024) Ramesh, Jayroop; Sankalpa, Donthi; Mitra, Rohan; Dhou, Salam
    Respiration-correlated cone-beam computed tomography (4D-CBCT) is an X-ray-based imaging modality that uses reconstruction algorithms to produce time-varying volumetric images of moving anatomy over a cycle of respiratory motion. The quality of the produced images is affected by the number of CBCT projections available for reconstruction. Interpolation techniques have been used to generate intermediary projections to be used, along with the original projections, for reconstruction. Transfer learning is a powerful approach that harnesses the ability to reuse pre-trained models in solving new problems. Methods: Several state-of-the-art pre-trained deep learning models, used for video frame interpolation, are utilized in this work to generate intermediary projections. Moreover, a novel regression predictive modeling approach is also proposed to achieve the same objective. Digital phantom and clinical datasets are used to evaluate the performance of the models. Results: The results show that the Real-Time Intermediate Flow Estimation (RIFE) algorithm outperforms the others in terms of the Structural Similarity Index Method (SSIM): 0.986 ± 0.010, Peak Signal to Noise Ratio (PSNR): 44.13 ± 2.76, and Mean Square Error (MSE): 18.86 ± 206.90 across all datasets. Moreover, the interpolated projections were used along with the original ones to reconstruct a 4D-CBCT image that was compared to that reconstructed from the original projections only. Conclusions: The reconstructed image using the proposed approach was found to minimize the streaking artifacts, thereby enhancing the image quality. This work demonstrates the advantage of using general-purpose transfer learning algorithms in 4D-CBCT image enhancement.
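The MSE and PSNR figures quoted in this abstract follow standard definitions. The snippet below is a minimal sketch of those two metrics for a pair of projections, assuming images stored as lists of rows and a peak intensity of 255 (the abstract does not state the intensity range). The function names are illustrative, not taken from the paper.

```python
import math

def mse(reference, estimate):
    """Mean squared error between two equally sized 2D images (lists of rows)."""
    total = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for r, e in zip(ref_row, est_row):
            total += (r - e) ** 2
            count += 1
    return total / count

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` is an assumed maximum intensity."""
    error = mse(reference, estimate)
    if error == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / error)

# Toy 2x2 "projections" (made-up values, for illustration only).
original = [[100, 110], [120, 130]]
interpolated = [[101, 109], [122, 130]]
print(mse(original, interpolated))
print(psnr(original, interpolated))
```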
  • Item
    Hand-Crafted Features With A Simple Deep Learning Architecture For Sensor-Based Human Activity Recognition
    (IEEE, 2024-07-10) Albadawi, Yaman; Shanableh, Tamer
    With the growth in the wearable device market, wearable sensor-based human activity recognition systems have been gaining increasing interest in research because of their rising demands in many areas. This research presents a novel sensor-based human activity recognition system that utilizes a unique feature extraction technique associated with a deep learning method for classification. One of the main contributions of this work is dividing the sensor sequences time-wise into non-overlapping 2D segments. Then, statistical features are computed from each 2D segment using two approaches; the first approach computes features from the raw sensor readings, while the second approach applies time-series differencing to sensor readings prior to feature calculations. Applying time-series differencing to 2D segments helps in identifying the underlying structure and dynamics of the sensor reading across time. This work experiments with different numbers of 2D segments of sensor reading sequences. Also, it reports results with and without the use of different components of the proposed system. Additionally, it analyses the best-performing models’ complexity, comparing them with other models trained by integrating the proposed method with an existing transformer network. All of these arrangements are tested with different deep-learning architectures supported by an attention layer to enhance the model. Four benchmark datasets are used to perform several experiments, namely, mHealth, USC-HAD, UCI-HAR, and DSA. The experimental results revealed that the proposed system outperforms human activity recognition rates reported in the most recent studies. Specifically, this work reports recognition rates of 99.17%, 81.07%, 99.44%, and 94.03% for the four datasets, respectively.
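The segment-then-summarize idea described above can be sketched roughly as follows. This is an illustration under assumptions: the abstract does not list which statistics are computed, so mean and standard deviation are used as stand-ins, and trailing samples that do not fill a segment are simply dropped. All function names are hypothetical.

```python
import statistics

def split_segments(sequence, num_segments):
    """Split sensor samples time-wise into equal, non-overlapping segments.
    Trailing samples that do not fill a segment are dropped (an assumption)."""
    length = len(sequence) // num_segments
    return [sequence[i * length:(i + 1) * length] for i in range(num_segments)]

def difference(segment):
    """First-order time-series differencing, x[t] - x[t-1], per channel."""
    return [[b - a for a, b in zip(prev, curr)]
            for prev, curr in zip(segment, segment[1:])]

def channel_stats(segment):
    """Mean and population standard deviation for each sensor channel."""
    features = []
    for channel in zip(*segment):
        features.append(statistics.fmean(channel))
        features.append(statistics.pstdev(channel))
    return features

def extract_features(sequence, num_segments):
    """Concatenate raw and differenced statistics for every segment."""
    features = []
    for segment in split_segments(sequence, num_segments):
        features.extend(channel_stats(segment))              # approach 1: raw readings
        features.extend(channel_stats(difference(segment)))  # approach 2: differenced
    return features

# Eight samples of a two-channel sensor (made-up values).
readings = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0],
            [4.0, 2.0], [4.0, 4.0], [4.0, 6.0], [4.0, 8.0]]
features = extract_features(readings, num_segments=2)
print(len(features))
```

Each segment contributes a fixed number of values regardless of its duration, which is what lets a simple classifier consume the result.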
  • Item
    Quantifying day-to-day variations in 4DCBCT-based PCA motion models
    (IOP Science, 2020) Dhou, Salam; Lewis, John; Cai, Weixing; Ionascu, Dan; Williams, Christopher
    The aim of this paper is to quantify the day-to-day variations of motion models derived from pre-treatment 4-dimensional cone beam CT (4DCBCT) fractions for lung cancer stereotactic body radiotherapy (SBRT) patients. Motion models are built by 1) applying deformable image registration (DIR) on each 4DCBCT image with respect to a reference image from that day, resulting in a set of displacement vector fields (DVFs), and 2) applying principal component analysis (PCA) on the DVFs to obtain principal components representing a motion model. Variations were quantified by comparing the PCA eigenvectors of the motion model built from the first day of treatment to the corresponding eigenvectors of the other motion models built from each successive day of treatment. Three metrics were used to quantify the variations: root mean squared (RMS) difference in the vectors, directional similarity, and an introduced metric called the Euclidean Model Norm (EMN). EMN quantifies the degree to which a motion model derived from the first fraction can represent the motion models of subsequent fractions. Twenty-one 4DCBCT scans from five SBRT patient treatments were used in this retrospective study. Experimental results demonstrated that the first two eigenvectors of motion models across all fractions have smaller RMS (0.00017), larger directional similarity (0.528), and larger EMN (0.678) than the last three eigenvectors (RMS: 0.00025, directional similarity: 0.041, and EMN: 0.212). The study concluded that, while the motion model eigenvectors varied from fraction to fraction, the first few eigenvectors were shown to be more stable across treatment fractions than others. This supports the notion that a pre-treatment motion model built from the first few PCA eigenvectors may remain valid throughout a treatment course. Future work is necessary to quantify how day-to-day variations in these models will affect motion reconstruction accuracy for specific clinical tasks.
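Two of the three metrics named above can be sketched in a few lines. The abstract does not define directional similarity, so the sketch assumes it is the absolute cosine of the angle between two eigenvectors (absolute because the sign of a PCA eigenvector is arbitrary). EMN is omitted because the abstract gives no formula for it. Function names are illustrative.

```python
import math

def rms_difference(u, v):
    """Root mean squared difference between two eigenvectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))

def directional_similarity(u, v):
    """Absolute cosine similarity (an assumed definition, see lead-in)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return abs(dot) / (norm_u * norm_v)

# Made-up first eigenvectors from two treatment days.
day1 = [0.6, 0.8, 0.0]
day2 = [0.8, 0.6, 0.0]
print(rms_difference(day1, day2))
print(directional_similarity(day1, day2))
```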
  • Item
    LungVision: X-ray Imagery Classification for On-Edge Diagnosis Applications
    (MDPI, 2024) Aldamani, Raghad; Abuhani, Diaa Addeen; Shanableh, Tamer
    This study presents a comprehensive analysis of utilizing TensorFlow Lite on mobile phones for the on-edge medical diagnosis of lung diseases. This paper focuses on the technical deployment of various deep learning architectures to classify nine respiratory system diseases using X-ray imagery. We propose a simple deep learning architecture that experiments with six different convolutional neural networks. Various quantization techniques are employed to convert the classification models into TensorFlow Lite, including post-classification quantization with floating point 16 bit representation, integer quantization with representative data, and quantization-aware training. This results in a total of 18 models suitable for on-edge deployment for the classification of lung diseases. We then examine the generated models in terms of model size reduction, accuracy, and inference time. Our findings indicate that the quantization-aware training approach demonstrates superior optimization results, achieving an average model size reduction of 75.59%. Among many CNNs, MobileNetV2 exhibited the highest performance-to-size ratio, with an average accuracy loss of 4.1% across all models using the quantization-aware training approach. In terms of inference time, TensorFlow Lite with integer quantization emerged as the most efficient technique, with an average improvement of 1.4 s over other conversion approaches. Our best model, which used EfficientNetB2, achieved an F1-Score of approximately 98.58%, surpassing state-of-the-art performance on the X-ray lung diseases dataset in terms of accuracy, specificity, and sensitivity. The model experienced an F1 loss of around 1% using quantization-aware optimization. The study culminated in the development of a consumer-ready app, with TensorFlow Lite models tailored to mobile devices.
  • Publication
    Two-Stage Deep Learning Solution for Continuous Arabic Sign Language Recognition Using Word Count Prediction and Motion Images
    (IEEE, 2023) Shanableh, Tamer
    Recognition of continuous sign language is challenging as the number of words in a sentence and their boundaries are unknown during the recognition stage. This work proposes a two-stage solution in which the number of words in a sign language sentence is predicted in the first stage. The sentence is then temporally segmented accordingly and each segment is represented in a single image using a novel solution that entails summation of frame differences using motion estimation and compensation. This results in a single image representation per sign language word referred to as a motion image. CNN transfer learning is used to convert each of these motion images into a feature vector which is used for either model generation or sign language recognition. As such, two deep learning models are generated; one for predicting the number of words per sentence and the other for recognizing the meaning of the sign language sentences. The proposed solution of predicting the number of words per sentence and thereafter segmenting the sentence into equal segments worked well. This is because each motion image can contain traces of previous or successive words. This byproduct of the proposed solution is advantageous as it puts words into context, thus justifying the excellent sign language recognition rates reported. It is shown that bidirectional LSTM layers result in the most accurate models for both stages. In the experimental results section we use an existing dataset that contains 40 sentences generated from 80 sign language words. The experiments revealed that the proposed solution resulted in word and sentence recognition rates of 97.3% and 92.6%, respectively. The percentage increases over the best results reported in the literature for the same dataset are 1.8% and 9.1% for word and sentence recognition, respectively.
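The segmentation and motion-image steps can be illustrated with a simplified sketch. It assumes grayscale frames stored as lists of rows and sums plain absolute frame differences; the motion estimation and compensation used in the paper are left out, so this is a simplification rather than the authors' method. Function names are hypothetical.

```python
def segment_equally(frames, predicted_word_count):
    """Split a frame sequence into contiguous, near-equal segments, one per word."""
    n = len(frames)
    bounds = [i * n // predicted_word_count for i in range(predicted_word_count + 1)]
    return [frames[bounds[i]:bounds[i + 1]] for i in range(predicted_word_count)]

def motion_image(frames):
    """Accumulate absolute differences between consecutive frames into one image.
    No motion compensation is applied here (a simplifying assumption)."""
    rows, cols = len(frames[0]), len(frames[0][0])
    accumulator = [[0] * cols for _ in range(rows)]
    for previous, current in zip(frames, frames[1:]):
        for r in range(rows):
            for c in range(cols):
                accumulator[r][c] += abs(current[r][c] - previous[r][c])
    return accumulator

# Four made-up 1x2 frames and a predicted count of two words.
sentence = [[[0, 0]], [[1, 0]], [[1, 5]], [[1, 9]]]
segments = segment_equally(sentence, predicted_word_count=2)
images = [motion_image(segment) for segment in segments]
print(images)
```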
  • Publication
    Video-Based Recognition of Human Activity Using Novel Feature Extraction Techniques
    (MDPI, 2023-06-05) Issa, Obada; Shanableh, Tamer
    This paper proposes a novel approach to activity recognition where videos are compressed using video coding to generate feature vectors based on compression variables. We propose to eliminate the temporal domain of feature vectors by computing the mean and standard deviation of each variable across all video frames. Thus, each video is represented by a single feature vector of 67 variables. As for the motion vectors, we eliminated their temporal domain by projecting their phases using PCA, thus representing each video by a single feature vector with a length equal to the number of frames in a video. Consequently, complex classifiers such as LSTM can be avoided and classical machine learning techniques can be used instead. Experimental results on the JHMDB dataset resulted in average classification accuracies of 68.8% and 74.2% when using the projected phases of motion vectors and video coding feature variables, respectively. The advantage of the proposed solution is the use of FVs with low dimensionality and simple machine learning techniques.
  • Publication
    Static Video Summarization Using Video Coding Features with Frame-level Temporal Sub-Sampling and Deep Learning
    (MDPI, 2023) Issa, Obada; Shanableh, Tamer
    There is an abundance of digital video content due to the cloud’s phenomenal growth and security footage; it is therefore essential to summarize these videos in data centers. This paper offers innovative approaches to the problem of key-frame extraction for the purpose of video summarization. Our approach includes feature variables extracted from the bit streams of coded videos, followed by optional stepwise regression for dimensionality reduction. Once the features are extracted and reduced in dimensionality, we apply innovative frame-level temporal sub-sampling techniques followed by training and testing using deep learning architectures. The frame-level temporal subsampling techniques are based on cosine similarity and PCA projections of feature vectors. We create three different learning architectures by utilizing LSTM networks, 1D-CNN networks, and Random Forests. The four most popular video summarization datasets, namely, TVSum, SumMe, OVP, and VSUMM are used to evaluate the accuracy of the proposed solutions. This includes the Precision, Recall, F-score measures, and computational time. It is shown that the proposed solutions, when trained and tested on all subjective user summaries, achieved F-scores of 0.79, 0.74, 0.88, and 0.81, respectively, for the aforementioned datasets, showing clear improvements over prior studies.
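One plausible reading of cosine-similarity-based temporal sub-sampling is sketched below: a frame is kept only when its feature vector differs enough from the last kept frame. The rule and the 0.99 threshold are assumptions made for illustration; the abstract does not spell out the exact procedure.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def subsample(frames, threshold=0.99):
    """Return indices of frames whose features differ from the last kept frame.
    `threshold` is a hypothetical tuning value."""
    if not frames:
        return []
    kept = [0]
    for index in range(1, len(frames)):
        if cosine_similarity(frames[kept[-1]], frames[index]) < threshold:
            kept.append(index)
    return kept

# Made-up two-dimensional feature vectors for five frames.
frame_features = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0], [1.0, 1.0]]
kept_indices = subsample(frame_features)
print(kept_indices)
```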
  • Publication
    Assessing test suites of extended finite state machines against model and code based faults
    (John Wiley & Sons, 2021) El-Fakih, Khaled; Alzaatreh, Ayman; Turker, Uraz Cengiz
    Tests can be derived from extended finite state machine (EFSM) specifications considering the coverage of single-transfer faults, all transitions using a transition tour, all-uses, edge-pair, and prime path with side trip. We provide novel empirical assessments of the effectiveness of these test suites. The first assessment determines for each pair of test suites if there is a difference between the pair in covering EFSM faults of six EFSM specifications. If the difference is found significant, we determine which test suite outperforms the other. The second assessment is similar to the first; yet, it is carried out against code faults of 12 Java implementations of the specifications. Besides, two assessments are provided to determine whether test suites have better coverage of certain classes of EFSM (or code) faults than others. The evaluation uses proper data transformation of mutation scores and p-value adjustments for controlling Type I error due to multiple tests. Furthermore, we show that subsuming mutants have an impact on mutation scores of both EFSM and code faults; and accordingly, we use a score that removes them in order not to invalidate the obtained results. The assessments show that all-uses tests were outperformed by all other tests; transition tours outperformed both edge-pair and prime path with side trips; and single-transfer fault tests outperformed all other test suites. Similar results are obtained over the considered EFSM and code fault domains, and there were no significant differences between the test suites coverage of different classes of EFSM and code faults.
  • Publication
    AgroAId: A Mobile App System for Visual Classification of Plant Species and Diseases Using Deep Learning and TensorFlow Lite
    (MDPI, 2022) Reda, Mariam; Suwwan, Rawan; Alkafri, Seba; Rashed, Yara; Shanableh, Tamer
    This paper aims to assist novice gardeners in identifying plant diseases to circumvent misdiagnosing their plants and to increase general horticultural knowledge for better plant growth. In this paper, we develop a mobile plant care support system (“AgroAId”), which incorporates computer vision technology to classify a plant’s [species–disease] combination from an input plant leaf image, recognizing 39 [species-and-disease] classes. Our method comprises a comparative analysis to maximize our multi-label classification model’s performance and determine the effects of varying the convolutional neural network (CNN) architectures, transfer learning approach, and hyperparameter optimizations. We tested four lightweight, mobile-optimized CNNs – MobileNet, MobileNetV2, NasNetMobile, and EfficientNetB0 – and tested four transfer learning scenarios (percentage of frozen-vs.-retrained base layers): (1) freezing all convolutional layers; (2) freezing 80% of layers; (3) freezing 50% only; and (4) retraining all layers. A total of 32 model variations are built and assessed using standard metrics (accuracy, F1-score, confusion matrices). The most lightweight, high-accuracy model is concluded to be an EfficientNetB0 model using a fully retrained base network with optimized hyperparameters, achieving 99% accuracy and demonstrating the efficacy of the proposed approach; it is integrated into our plant care support system in a TensorFlow Lite format alongside the front-end mobile application and centralized cloud database. Finally, our system also uses the collective user classification data to generate spatiotemporal analytics about regional and seasonal disease trends, making these analytics accessible to all system users to increase awareness of global agricultural trends.
  • Publication
    CNN and HEVC Video Coding Features for Static Video Summarization
    (IEEE, 2022) Issa, Obada; Shanableh, Tamer
    This study proposes a novel solution for the detection of keyframes for static video summarization. We preprocessed the well-known video datasets by coding them using the HEVC video coding standard. During coding, 64 proposed features were generated from the coder for each frame. Additionally, we converted the original YUVs of the raw videos into RGB images and fed them into pretrained CNN networks for feature extraction. These include GoogleNet, AlexNet, Inception-ResNet-v2, and VGG16. The modified datasets are made publicly available to the research community. Before detecting keyframes in a video, it is important to identify and eliminate duplicate or similar video frames. A subset of the proposed HEVC feature set was used to identify these frames and eliminate them from the video. We also propose an elimination solution based on the sum of the absolute differences between a frame and its motion-compensated predecessor. The proposed solutions are compared with existing works based on an SIFT flow algorithm that uses CNN features. Subsequently, an optional dimensionality reduction based on stepwise regression was applied to the feature vectors prior to detecting key frames. The proposed solution is compared with existing studies that use sparse autoencoders with CNN features for dimensionality reduction. The accuracy of the proposed key-frame detection system was assessed using the positive predictive values, sensitivity, and F-scores. Combining the proposed solution with Multi-CNN features and using a random forest classifier, it was shown that the proposed solution achieved an average F-score of 0.98.
  • Publication
    Classifying Maqams of Qur'anic Recitations Using Deep Learning
    (IEEE Access, 2021) Shahriar, Sakib; Tariq, Usman
    The Holy Qur’an is among the most recited and memorized books in the world. For beautification of Qur’anic recitation, almost all reciters around the globe perform their recitations using a specific melody, known as maqam in Arabic. However, it is more difficult for students to learn this art compared to other techniques of Qur’anic recitation such as Tajwid due to limited resources. Technological advancement can be utilized for automatic classification of these melodies which can then be used by students for self-learning. Using state-of-the-art deep learning algorithms, this research focuses on the classification of the eight popular maqamat (plural of maqam). Various audio features including Mel-frequency cepstral coefficients, spectral, energy and chroma features are obtained for model training. Several deep learning architectures including CNN, LSTM, and deep ANN are trained to classify audio samples from one of the eight maqamat. An accuracy of 95.7% on the test set is obtained using a 5-layer deep ANN which was trained using 26 input features. To the best of our knowledge, this is the first ever work that addresses maqam classification of Holy Qur’an recitations. We also introduce the “Maqam-478” dataset that can be used for further improvements on this work.
  • Publication
    Prediction of EV Charging Behavior Using Machine Learning
    (IEEE Access, 2021) Shahriar, Sakib; Al-Ali, Abdul-Rahman; Osman, Ahmed; Dhou, Salam; Nijim, Mais
    As a key pillar of smart transportation in smart city applications, electric vehicles (EVs) are becoming increasingly popular for their contribution to reducing greenhouse gas emissions. One of the key challenges, however, is the strain on power grid infrastructure that comes with large-scale EV deployment. The solution to this lies in the utilization of smart scheduling algorithms to manage the growing public charging demand. Using data-driven tools and machine learning algorithms to learn the EV charging behavior can improve scheduling algorithms. Researchers have focused on using historical charging data for predictions of behavior such as departure time and energy needs. However, variables such as weather, traffic, and nearby events, which have been neglected to a large extent, can perhaps add meaningful representations and provide better predictions. Therefore, in this paper we propose the usage of historical charging data in conjunction with weather, traffic, and events data to predict EV session duration and energy consumption using popular machine learning algorithms including random forest, SVM, XGBoost and deep neural networks. The best predictive performance is achieved by an ensemble learning model, with SMAPE scores of 9.9% and 11.6% for session duration and energy consumption, respectively, which improves upon the existing works in the literature. In both predictions, we demonstrate a significant improvement compared to previous work on the same dataset and we highlight the importance of traffic and weather information for charging behavior predictions.
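For readers unfamiliar with SMAPE (symmetric mean absolute percentage error), the sketch below shows one common variant, in which each absolute error is divided by the mean of the absolute actual and predicted values. Several variants exist and the abstract does not say which one the authors used, so treat the exact formula as an assumption.

```python
def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent.
    Pairs where both values are zero contribute zero error."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denominator = (abs(a) + abs(p)) / 2.0
        if denominator != 0:
            total += abs(p - a) / denominator
    return 100.0 * total / len(actual)

# Made-up session durations in hours.
durations_hours = [2.0, 4.0, 1.0]
predicted_hours = [2.0, 3.0, 1.5]
print(smape(durations_hours, predicted_hours))
```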
  • Publication
    Fog Computing Approach for Shared Mobility in Smart Cities
    (MDPI, 2021) Aburukba, Raafat; Al-Ali, Abdul-Rahman; Riaz, Ahmed H.; Al Nabulsi, Ahmad; Khan, Danayal; Khan, Shavaiz; Amer, Moustafa
    Smart transportation is a smart city application where traditional individual models are transforming to shared and distributed ownership. These models are used to serve commuters for inter- and intra-city travel. However, short-range urban transportation services within campuses, residential compounds, and public parks are not explored to their full capacity compared to the distributed vehicle model. This paper aims to explore and design an adequate framework for battery-operated shared mobility within a large community for short-range travel. This work identifies the characteristics of the shared mobility for battery-operated vehicles and accordingly proposes an adequate solution that deals with real-time data collection, tracking, and automated decisions. Furthermore, given the requirement for real-time decisions with low latency for critical requests, the paper deploys the proposed framework within the 3-tier computing model, namely edge, fog, and cloud tiers. The solution design considers the power consumption requirement at the edge by offloading the computational requests to the fog tier and utilizing the LoRaWAN communication technology. A prototype implementation is presented to validate the proposed framework for a university campus using e-bikes. The results show the scalability of the proposed design and the achievement of low latency for requests that require real-time decisions.
  • Publication
    Non-Destructive Water Leak Detection Using Multitemporal Infrared Thermography
    (IEEE, 2021) Yahia, Mohamed; Gawai, Rahul; Ali, Tarig; Mortula, Maruf; Albasha, Lutfi; Landolsi, Taha
    Water leakage detection and localization in distribution network pipelines is a challenge for utility companies. For this purpose, thermal Infrared Radiation (IR) techniques have been widely applied in the literature. However, the classical analysis of IR images has not been robust in detecting and locating leakage, due to the presence of thermal anomalies such as shadows. In this study, to improve the detection and location accuracy, a digital image processing tool based on multitemporal IR is proposed. In multitemporal IR analysis, the variation of the soil's temperature due to field temperature can be obtained, and hence estimating variations due to water leakage would be more accurate. An experimental setup was built to evaluate the proposed multitemporal IR water leak detection method. In order to consider the temporal temperature variation due to water leakage and mitigate the field temperature effects, a luminance transformation of the IR images was introduced. To determine the temporal temperature variation of the soil's surface due to the leakage, several metrics have been considered such as the difference, the ratio, the log-ratio and the coefficient of variation (CV) images. Based on the experimental results, the log-ratio and the CV images were the most robust metrics. Then, based on the log-ratio or the CV image, a temporal variation image (TVI) that conveys the temporal IR luminance variation was introduced. The analysis of the TVI image showed that the CV image is less noisy than the log-ratio image, and can more accurately locate the leakage. Finally, based on the TVI histogram, a threshold was defined to classify the TVI image into leakage/non-leakage areas. Results showed that the proposed method is capable of accurately detecting and locating water leakage, which is an improvement over the false detections of spatial thermal IR analysis.
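The log-ratio and coefficient of variation images named above can be sketched per pixel as follows. The sketch assumes luminance images stored as lists of rows; the fixed threshold of 0.1 is a hypothetical stand-in for the histogram-derived threshold described in the abstract, and all function names are illustrative.

```python
import math
import statistics

def log_ratio_image(first, second, eps=1e-6):
    """Per-pixel log(second / first); `eps` guards against division by zero."""
    return [[math.log((b + eps) / (a + eps)) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first, second)]

def cv_image(stack):
    """Per-pixel coefficient of variation (std / mean) across a temporal stack."""
    rows, cols = len(stack[0]), len(stack[0][0])
    result = []
    for r in range(rows):
        row = []
        for c in range(cols):
            series = [image[r][c] for image in stack]
            mean = statistics.fmean(series)
            row.append(statistics.pstdev(series) / mean if mean else 0.0)
        result.append(row)
    return result

def threshold_mask(image, threshold):
    """True marks pixels classified as leakage under the assumed threshold."""
    return [[value > threshold for value in row] for row in image]

# Three made-up 2x2 luminance images; only the bottom-right pixel changes over time.
stack = [
    [[100.0, 100.0], [100.0, 100.0]],
    [[100.0, 100.0], [100.0, 80.0]],
    [[100.0, 100.0], [100.0, 60.0]],
]
log_ratio = log_ratio_image(stack[0], stack[-1])
cv = cv_image(stack)
mask = threshold_mask(cv, threshold=0.1)
print(mask)
```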
  • Publication
    Microwave Imaging for Early Breast Cancer Detection: Current State, Challenges, and Future Directions
    (MDPI, 2022) AlSawaftah, Nour Majdi; Elabed, Salma Sami; Dhou, Salam; Zakaria, Amer
    Breast cancer is the most commonly diagnosed cancer type and is the leading cause of cancer-related death among females worldwide. Breast screening and early detection are currently the most successful approaches for the management and treatment of this disease. Several imaging modalities are currently utilized for detecting breast cancer, of which microwave imaging (MWI) is gaining quite a lot of attention as a promising diagnostic tool for early breast cancer detection. MWI is a noninvasive, relatively inexpensive, fast, convenient, and safe screening tool. The purpose of this paper is to provide an up-to-date survey of the principles, developments, and current research status of MWI for breast cancer detection. This paper is structured into two sections; the first is an overview of current MWI techniques used for detecting breast cancer, followed by an explanation of the working principle behind MWI and its various types, namely, microwave tomography and radar-based imaging. In the second section, a review of the initial experiments along with more recent studies on the use of MWI for breast cancer detection is presented. Furthermore, the paper summarizes the challenges facing MWI as a breast cancer detection tool and provides future research directions. On the whole, MWI has proven its potential as a screening tool for breast cancer detection, both as a standalone or complementary technique. However, there are a few challenges that need to be addressed to unlock the full potential of this imaging modality and translate it to clinical settings.
  • Publication
    In-Between Projection Interpolation in Cone-Beam CT Imaging using Convolutional Neural Networks
    (Society of Photo-Optical Instrumentation Engineers (SPIE), 2022) Dweek, Samaa; Dhou, Salam; Shanableh, Tamer
    Respiratory-Correlated cone beam computed tomography (4D-CBCT) is an emerging image-guided radiation therapy (IGRT) technique that is used to account for the uncertainties caused by respiratory-induced motion in the radiotherapy treatment of tumors in thoracic and upper-abdomen regions. In 4D-CBCT, projections are sorted into bins based on their respiratory phase and a 3D image is reconstructed from each bin. However, the quality of the resulting 4D-CBCT images is limited by the streaking artifacts that result from having an insufficient number of projections in each bin. In this work, an interpolation method based on Convolutional Neural Networks (CNN) is proposed to generate new in-between projections to increase the overall number of projections used in 4D-CBCT reconstruction. Projections simulated using XCAT phantom were used to assess the proposed method. The interpolated projections using the proposed method were compared to the corresponding original projections by calculating the peak-signal-to-noise ratio (PSNR), root mean square error (RMSE), and structural similarity index measurement (SSIM). Moreover, the results of the proposed method were compared to the results of existing standard interpolation methods, namely, linear, spline, and registration-based methods. The interpolated projections using the proposed method had an average PSNR, RMSE, and SSIM of 35.939, 4.115, and 0.968, respectively. Moreover, the results achieved by the proposed method surpassed the results achieved by the existing interpolation methods tested on the same dataset. In summary, this work demonstrates the feasibility of using CNN-based methods in generating in-between projections and shows a potential advantage to 4D-CBCT reconstruction.
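Of the baselines mentioned above, linear interpolation is the simplest to illustrate: an in-between projection is a per-pixel weighted average of its two neighbours. The sketch below shows only that baseline, not the CNN-based method, and assumes projections stored as lists of rows. The function name and the `weight` parameter are illustrative.

```python
def linear_in_between(previous, following, weight=0.5):
    """Per-pixel linear blend of two neighbouring projections.
    weight=0.5 gives the midpoint projection."""
    return [[(1.0 - weight) * p + weight * f for p, f in zip(row_p, row_f)]
            for row_p, row_f in zip(previous, following)]

# Two made-up 1x2 projections.
projection_a = [[0, 10]]
projection_b = [[10, 30]]
midpoint = linear_in_between(projection_a, projection_b)
print(midpoint)
```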
  • Publication
    Data Embedding in Scrambled Video by Rotating Motion Vectors
    (Springer, 2022-03) Ahmed, Afaf Eltayeb Mohamedelbagir; Shanableh, Tamer
    Data embedding in videos has several important applications including Digital Rights Management, preserving confidentiality of content, authentication and tampering detection. This paper proposes a novel data embedding solution in scrambled videos by rotating motion vectors of predicted macroblocks. The rotation of motion vectors and the propagation of motion compensation error serve another purpose, which is video scrambling. A compliant decoder uses machine learning to counter-rotate the motion vectors and extract embedded message bits. To achieve this, the decoder uses a sequence-dependent approach to train a classifier to distinguish between macroblocks reconstructed using rotated and un-rotated motion vectors. In the testing phase, motion vectors belonging to a classified macroblock are compared against the reviewed rotated motion vectors and the message bits are extracted. Furthermore, to guarantee accurate classification at the decoder, a constrained encoding approach is proposed in which data embedding is restricted to motion vectors that can be correctly counter-rotated at the decoder. The proposed solution is referred to as Classifying Rotated Vectors or CRVs for short. Experimental results revealed that scrambled videos can be reconstructed correctly without quality loss with a bitrate increase at the encoder of around 6% and an average data embedding rate of 1.68 bits per MB.
  • Publication
    HEVC Video Encryption with High Capacity Message Embedding by Altering Picture Reference Indices and Motion Vectors
    (IEEE, 2022) Shanableh, Tamer
    A high capacity message embedding in encrypted HEVC video is proposed in this paper. The challenges addressed in this paper include keeping the encrypted video compliant with standardized decoders, correctly decrypting the video and finally, correctly extracting the message bits. The message embedding is achieved by altering the values of reference picture indices and motion vectors which results in scrambled video. Sixteen picture references are used in this work and therefore, combined with alteration of motion vectors, a maximum of six message bits can be embedded per coding unit. Motion vectors are altered by swapping their x and y components and/or changing their signs. This is achieved with full compliance with the HEVC video syntax. To extract message bits, an authorized decoder builds a classification model per video sequence and uses it for predicting the true values of the reference indices and motion vectors. As such, message bits are extracted and the video is correctly reconstructed to its unscrambled state. Coding units that result in misclassification are identified at the encoder and excluded from message embedding. This results in slightly lower embedding rates but ensures accurate video reconstruction. Using nine video sequences of various resolutions that are compressed using four different quantization parameters, the experimental results revealed that the true average message embedding rate is 2.7 bits per coding unit or 173 kbit/s. This is achieved with accurate video reconstruction at the expense of increasing the bitrate of the encoder by 3%. Comparison with existing work shows that the proposed solution is superior in terms of embedding capacity whilst reducing the excessive bitrate of the encoder.
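The capacity figure above follows from simple arithmetic: sixteen reference pictures can carry log2(16) = 4 bits, and the motion vector alterations add 2 more, for a maximum of 6 bits per coding unit. The sketch below illustrates one hypothetical bit-to-alteration mapping (first bit swaps the components, second bit flips both signs). The paper's actual mapping is not given in the abstract, and its decoder recovers the bits with a classifier rather than being handed them, so `restore_motion_vector` only demonstrates that the alteration is invertible.

```python
import math

def alter_motion_vector(mv, bits):
    """Apply a hypothetical mapping: bits = (swap_bit, sign_bit)."""
    x, y = mv
    swap_bit, sign_bit = bits
    if swap_bit:
        x, y = y, x      # swap the x and y components
    if sign_bit:
        x, y = -x, -y    # flip both signs
    return (x, y)

def restore_motion_vector(altered, bits):
    """Undo alter_motion_vector, assuming the embedded bits are known."""
    x, y = altered
    swap_bit, sign_bit = bits
    if sign_bit:
        x, y = -x, -y
    if swap_bit:
        x, y = y, x
    return (x, y)

REFERENCE_PICTURES = 16
bits_per_coding_unit = int(math.log2(REFERENCE_PICTURES)) + 2

scrambled = alter_motion_vector((3, -5), (1, 1))
print(scrambled, bits_per_coding_unit)
```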
  • Publication
    Using C++ to Calculate SO(10) Tensor Couplings
    (MDPI, 2021-10-04) Bhagwagar, Azadan; Syed, Raza
    Model building in SO(10), which is the leading grand unification framework, often involves large Higgs representations and their couplings. Explicit calculation of such couplings is a multi-step process that involves laborious calculations that are time-consuming and error-prone, an issue which only grows as the complexity of the coupling increases. Therefore, there exists an opportunity to leverage the abilities of computer software in order to algorithmically perform these calculations on demand. This paper outlines the details of such software, implemented in C++ using built-in libraries. The software is capable of accepting invariant couplings involving an arbitrary number of SO(10) Higgs tensors, each having up to five indices. The output is then produced in LaTeX, so that it is universally readable and sufficiently expressive. Through the use of this software, SO(10) coupling analysis can be performed in a way that minimizes calculation time, eliminates errors, and allows for experimentation with couplings that have not been computed before in the literature. Furthermore, this software can be expanded in the future to account for similar Higgs–Spinor coupling analysis, or extended to include further SO(N) invariant couplings.
  • Publication
    Detecting Double and Triple Compression in HEVC Videos Using the Same Bit Rate
    (Springer, 2021) Youssef, Seba; Shanableh, Tamer
    Digital video forensics refers to the process of analysing, examining, evaluating and comparing a video for use in legal matters. In digital video forensics, the main aim is to detect and identify video forgery to ensure a video’s authenticity. When a video is edited, the original bitstream is first decoded, edited and then re-compressed. Therefore, detecting re-compression in videos is a major step in digital video forensics. Video editing can be applied many times, leading to multiple compressions. Thus, finding out the compression history of a video becomes an important means for detecting any manipulation and thereby identifying the legitimacy of a video. In this work, we propose a machine learning approach to detecting double and triple compression in videos coded using the High Efficiency Video Coding (HEVC) format. Feature variables are extracted from Coding Units (CUs) and summarized into picture and Group of Pictures (GoP) feature vectors. Two classifiers are used for classifying videos into single, double and triple compression, namely Random Forest (RF) and bi-directional Long Short-Term Memory (bi-LSTM). The latter classifier is important in digital video forensics as it exploits the temporal dependencies between feature vectors. In the experimental results, 127 video sequences are used for verifying the accuracy of the proposed solutions. Results are reported in terms of classification accuracy, confusion matrices, precision and recall. The experimental results revealed that both double and triple compression can be accurately detected using the proposed solutions, with results superior to existing work.