Application of artificial neural networks for understanding and diagnosing the state of mastitis in dairy cattle

Hassan, Khwaja J.
Date
2007
Type
Thesis
Abstract
Bovine mastitis adversely affects the dairy industry around the world. The disease is caused by a diverse range of bacteria, broadly categorised as minor and major pathogens. In-line tools that help identify these bacterial groupings in the early stages of the disease are advantageous, as timely decisions can be made before the cow develops any clinical symptoms. The first objective of this research was to identify the most informative milk parameters for the detection of minor and major bacterial pathogens. The second objective was to evaluate the potential of supervised and unsupervised neural network learning paradigms for the detection of minor infected and major infected quarters in the early stages of the disease. The third objective was to evaluate the effect of different proportions of infected to non-infected cases in the training data set on the correct classification rate of the supervised neural network models, as a herd contains proportionately more non-infected cases than infected cases. A database developed at Lincoln University was used to achieve the research objectives. Starting at calving, quarter milk samples were collected weekly from 112 cows for a period of fourteen weeks, resulting in 4852 samples with complete records for somatic cell count (SCC), electrical resistance, protein percentage, fat percentage, and bacteriological status. To account for the effect of the stage of lactation on milk parameters, the data was divided into three days-in-milk (DIM) ranges. In addition, cow variation was accounted for using the sire family from which each cow originated and the lactation number of each cow. The data was pre-processed before the application of advanced analytical techniques. Somatic cell score (SCS) and electrical resistance index (ERI) were derived from somatic cell count and electrical resistance, respectively.
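The abstract does not state how the somatic cell score was derived from the somatic cell count. As an illustration only, the sketch below uses the widely used log2-based convention, SCS = log2(SCC / 100,000) + 3 with SCC in cells/mL; whether the thesis used this exact form is an assumption, and the function name `somatic_cell_score` is hypothetical. No formula is sketched for the electrical resistance index because the abstract gives no basis for one.

```python
import math


def somatic_cell_score(scc_cells_per_ml: float) -> float:
    """Convert a somatic cell count (cells/mL) to a log2-based score.

    Assumption: the common convention SCS = log2(SCC / 100,000) + 3.
    The abstract does not confirm that the thesis used this form.
    """
    if scc_cells_per_ml <= 0:
        raise ValueError("SCC must be positive")
    return math.log2(scc_cells_per_ml / 100_000) + 3


# Each doubling of the cell count raises the score by one unit.
for scc in (100_000, 200_000, 400_000):
    print(scc, somatic_cell_score(scc))
```

The log transform compresses the heavily right-skewed cell counts into a roughly linear scale, which is generally easier for correlation analysis and neural network inputs to work with.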
After pre-processing, the data was divided into training and validation sets for the unsupervised neural network modelling experiment and into training, calibration and validation sets for the supervised neural network modelling experiments. Prior to any modelling experiments, the data was analysed using statistical and multivariate visualisation techniques. Correlations (p<0.05) were found between the infection status of a quarter and its somatic cell score (SCS, 0.86), electrical resistance index (ERI, -0.59) and protein percentage (PP, 0.33). The multivariate parallel visualisation analysis validated the correlation analysis. Due to significant multicollinearity (SCS and ERI, -0.65; SCS and PP, 0.32; ERI and PP, -0.35; all p<0.05), the original variables were decorrelated using principal component analysis. SCS and ERI were found to be the most informative variables for discriminating between non-infected, minor infected and major infected cases. The unsupervised neural network (USNN) model was trained using a training data set extracted from the database that contained approximately equal numbers of randomly selected records for each bacteriological status [not infected (NI), infected with a major pathogen (MJI) and infected with a minor pathogen (MNI)]. The USNN model was validated with the remaining data using the four principal components, days in milk (DIM), lactation number (LN), sire number, and bacteriological status (BS). The specificity of the USNN model in correctly identifying non-infected cases was 97%. Sensitivities for correctly detecting minor and major infections were 89% and 80%, respectively.
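The specificity and sensitivities quoted above can be read as per-class correct-classification rates: the share of truly non-infected cases labelled NI, and the shares of truly MNI and MJI cases labelled as such. That reading is an assumption based on the wording of the abstract. The sketch below computes these rates from a three-class confusion matrix; the function name `class_rates` and all counts are hypothetical and are not taken from the thesis.

```python
def class_rates(confusion, labels):
    """Return the fraction of each true class that was classified correctly.

    confusion[i][j] is the number of cases whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    rates = {}
    for i, label in enumerate(labels):
        row_total = sum(confusion[i])
        rates[label] = confusion[i][i] / row_total if row_total else float("nan")
    return rates


labels = ["NI", "MNI", "MJI"]
# Hypothetical counts for illustration only (rows: true class, columns: predicted).
confusion = [
    [90, 8, 2],   # true NI
    [10, 85, 5],  # true MNI
    [3, 2, 15],   # true MJI
]

rates = class_rates(confusion, labels)
print("specificity (NI):", rates["NI"])
print("sensitivity (MNI):", rates["MNI"])
print("sensitivity (MJI):", rates["MJI"])
```

Reporting one rate per class, rather than a single accuracy figure, keeps the performance on a rare class such as MJI visible.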
The supervised neural network (SNN) models were trained, calibrated and validated with several sets of training, calibration and validation data, which were randomly extracted from the database in such a way that each set had a different proportion of infected to non-infected cases, ranging from 1:1 to 1:10. The overall accuracy of these models on the validation data sets increased as the number of non-infected cases in the data sets increased (80% for 1:1, 84% for 1:2, 86% for 1:4 and 93% for 1:10). Specificities of the best models for correctly recognising non-infected cases for the four data sets were 82% for 1:1, 91% for 1:2, 94% for 1:4 and 98% for 1:10. Sensitivities for correctly recognising minor infected cases for the four data sets were 86% for 1:1, 76% for 1:2, 71% for 1:4 and 44% for 1:10. Sensitivities for correctly recognising major infected cases for the four data sets were 20% for 1:1, 20% for 1:2, 30% for 1:4 and 40% for 1:10. Overall, sensitivity for the minor infected cases decreased while that for the major infected cases increased as the number of non-infected cases in the training data set increased. Due to the very low prevalence of the MJI category in this particular herd, results for this category may be inconclusive. This research suggests that the somatic cell score and electrical resistance index of milk were the most effective variables for detecting the infection status of a quarter, followed by milk protein and fat percentage. The neural network models were able to differentiate milk containing minor and major bacterial pathogens based on milk parameters associated with mastitis. It is concluded that neural network models can be developed and incorporated into milking machines to provide an efficient and effective method for the diagnosis of mastitis.
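One way to build data sets with a fixed proportion of infected to non-infected cases, such as the 1:1 to 1:10 sets described above, is to keep every infected record and randomly draw the required number of non-infected records. The sketch below illustrates that idea only; the abstract does not describe the actual extraction procedure, and the function name `sample_by_ratio`, the record layout and the synthetic records are hypothetical.

```python
import random


def sample_by_ratio(records, is_infected, non_infected_per_infected, seed=0):
    """Keep all infected records and draw k non-infected records per infected one."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    infected = [r for r in records if is_infected(r)]
    healthy = [r for r in records if not is_infected(r)]
    wanted = len(infected) * non_infected_per_infected
    if wanted > len(healthy):
        raise ValueError("not enough non-infected records for this ratio")
    subset = infected + rng.sample(healthy, wanted)
    rng.shuffle(subset)
    return subset


# Synthetic records for illustration: (bacteriological status, record id).
records = (
    [("NI", i) for i in range(100)]
    + [("MNI", i) for i in range(8)]
    + [("MJI", i) for i in range(2)]
)

subset = sample_by_ratio(records, lambda r: r[0] != "NI", 4)  # a 1:4 data set
print(len(subset), sum(1 for r in subset if r[0] == "NI"))
```

Because overall accuracy is dominated by the largest class, a data set with more non-infected cases can show higher accuracy even while sensitivity for an infected class falls, which is consistent with the pattern reported above.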