Using WEKA https://waikato.github.io/weka-wiki/downloading_weka/


Dr. Said Baadel

Phishing is an attempt to gain sensitive personal and financial information (such as usernames and passwords, account details, and social security numbers) with malicious intent via online deception. Abdelhamid et al. (2014) experimented on over 1350 websites collected from different phishing data archives (752 phishing sites and 601 legitimate ones). The authors identified 16 common features that can help assess and predict any website type using common Machine Learning classification algorithms. The table below shows the features selected for the dataset provided. The authors categorized the values of each collected feature as Legitimate (1), Suspicious (0), and Phishy (-1).

Simple rule-based algorithms can be used to predict whether a website is legitimate or phishy. According to the authors, the following three rules demonstrate this.

1. Phishers hide the suspicious part of the URL to redirect information submitted by users, or to redirect the uploaded page to a suspicious domain. Some researchers have suggested that when the URL length is greater than 54 characters, the URL can be considered phishy.
Rule: If URL length < 54 -> Legit; else if URL length >= 54 and <= 75 -> Suspicious; else -> Phishy

2. The "@" symbol leads the browser to ignore everything before it and redirects the user to the link typed after it.
Rule: If URL has "@" -> Phishy; else -> Legit

3. Another technique phishers use to scam users is adding a subdomain to the URL, so that users may believe they are dealing with an authentic website.
Rule: If dots in domain < 3 -> Legit; else if dots in domain = 3 -> Suspicious; else -> Phishy

Using WEKA software, answer the following questions based on the Phishing dataset provided.

a) Draw a simple confusion matrix (a general one, not from WEKA) of the possible data scenarios for this Phishing dataset.
b) Draw a table that outlines the Accuracy, Precision, Recall, F-Measure, and ROC Area of the following rule-based algorithms: RIPPER (JRip), PART, and Decision Table.
c) Use Decision Tree algorithms (Random Forest, Random Tree) and an Artificial Neural Network (Multilayer Perceptron) to compare with the results in part b) above. Do these give better prediction accuracy on this dataset?
d) What is your conclusion from these experiments regarding the ML algorithms used?

Save your work as PDF and submit on Moodle.

Instructions: Download WEKA using the link provided. After successful installation, run the app. In the Applications column, click "Explorer". In the following dialog box, click "Open file…" and select the Phishing dataset given. WEKA loads the dataset and summarizes its attributes in the "Preprocess" tab. Select the "Classify" tab and pick the Machine Learning classification algorithm needed to answer the questions above.

Reference: Abdelhamid et al. (2014). Phishing Detection based Associative Classification Data Mining. Expert Systems With Applications (ESWA), Vol. 41, pp. 5948-5959.
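To make the three URL rules quoted in this assignment concrete, here is a minimal Python sketch of them. It is an illustration only, not the authors' implementation and not part of WEKA: the function names are hypothetical, the URL is assumed to include a scheme such as http://, and dots are counted in the whole hostname, which is an assumption about how "dots in domain" is measured.

```python
from urllib.parse import urlparse

# Feature values as categorized in the assignment text.
LEGIT, SUSPICIOUS, PHISHY = 1, 0, -1


def url_length_rule(url: str) -> int:
    """Rule 1: < 54 chars is legit, 54 to 75 is suspicious, longer is phishy."""
    length = len(url)
    if length < 54:
        return LEGIT
    if length <= 75:
        return SUSPICIOUS
    return PHISHY


def at_symbol_rule(url: str) -> int:
    """Rule 2: any '@' in the URL marks it as phishy."""
    return PHISHY if "@" in url else LEGIT


def subdomain_rule(url: str) -> int:
    """Rule 3: count dots in the hostname (assumption: whole hostname)."""
    host = urlparse(url).hostname or ""  # empty if the URL has no scheme
    dots = host.count(".")
    if dots < 3:
        return LEGIT
    if dots == 3:
        return SUSPICIOUS
    return PHISHY


if __name__ == "__main__":
    sample = "http://secure.login.example.com/account"
    print(url_length_rule(sample), at_symbol_rule(sample), subdomain_rule(sample))
```

Each function returns one feature value per URL; a rule learner such as JRip or PART would combine many such features rather than rely on any single one.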
Individual Predictive Model – BUSI 650
Dr. Said Baadel

Student Name:
Student ID:

Answer all questions in the spaces provided. This test should be individual work. Once completed, save the file as a PDF and submit.

** I agree that the work in this assignment is my own work. I acknowledge that I am expected to exercise the utmost academic integrity in all work submitted for this course. I also acknowledge that I have read the FAQ posted on Moodle and understand the consequences of plagiarism. **

One of the main concerns of the United Nations' Sustainable Development Goals is maternal mortality around the world, especially in less developed countries. A 2020 study [1] analyzed the risk factors of women during pregnancy. The attached data is presented to you for analysis. Upon graduating from UCW, you have been tasked to use Machine Learning (ML) to predict whether females with certain health attributes pose a high risk of maternal mortality or not.

Data Set Information: The dataset provided contains 1020 instances of pregnant women, with 7 different attributes including their risk level (high-risk 276, mid-risk 336, and low-risk 408). The 7 attributes recorded are:

Age: age in years
SystolicBP: systolic blood pressure (maximum pressure the heart exerts while beating), in mmHg
DiastolicBP: diastolic blood pressure (pressure in the arteries between beats), in mmHg
BS: blood sugar (blood glucose level as a molar concentration), in mmol/L
BodyTemp: body temperature
HeartRate: resting heart rate, in beats per minute
RiskLevel: whether high/medium/low risk of mortality

1. Draw a simple confusion matrix for the pregnant women mortality test. (2 Marks)
2. Draw one table highlighting the performance of the following classifiers: RIPPER, PART, Decision Table, Random Forest, J48, Random Tree, Artificial Neural Network, Simple Logistic, and Naïve Bayes. In this table, highlight and group the different classifier types (i.e., bayes, functions, trees, and rule-based). Show the following performance measures for your evaluations: Accuracy, Sensitivity/Recall, Specificity, Precision, F-Measure, and ROC Area. (4 Marks)
3. Analyze the results in question 2 above. Explain in detail the performance of the above classifiers by comparing the classifier types. (3 Marks) Select one classifier that is better suited for the dataset and that you wish to recommend, stating which measure(s) your choice is based on and why. (2 Marks)
4. Run a cluster analysis algorithm (simple k-means) on the dataset. Did the algorithm do a better job of clustering the dataset, given that we know the predicted attribute (i.e., high/mid/low risk)? (2 Marks) Explain your answer based on the results. (2 Marks)

[1] Ahmed M., Kashem M.A., Rahman M., Khatun S. (2020) Review and Analysis of Risk Factor of Maternal Health in Remote Area Using the Internet of Things (IoT).
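The performance measures requested in question 2 can all be derived from a confusion matrix. The sketch below computes them one-vs-rest for a three-class problem in plain Python. The function name is hypothetical and the example counts are made up purely for illustration; they are not results from the maternal risk dataset. WEKA's per-class output lists TP rate (recall) and FP rate, so specificity can be obtained as 1 minus the FP rate.

```python
def per_class_metrics(matrix, labels):
    """matrix[i][j] = number of instances of actual class i predicted as class j."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    results = {"accuracy": correct / total}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                  # actual k, predicted otherwise
        fp = sum(row[k] for row in matrix) - tp   # predicted k, actually otherwise
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        results[label] = {
            "precision": precision,
            "recall": recall,
            "specificity": specificity,
            "f_measure": f_measure,
        }
    return results


if __name__ == "__main__":
    # Made-up counts for illustration only (rows = actual, columns = predicted).
    labels = ["high", "mid", "low"]
    example = [
        [8, 1, 1],
        [2, 6, 2],
        [0, 2, 8],
    ]
    for name, values in per_class_metrics(example, labels).items():
        print(name, values)
```

ROC Area is not included here because it needs the classifier's predicted probabilities rather than the confusion matrix alone.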
