This programming assignment involves writing a program to analyze some recent data about fake news articles and identify what they most commonly focus on.
The input data for this assignment are derived from www.kaggle.com/mrisdal/fake-news. For use in this assignment, the original data file has been modified as follows:
If you would like to examine or work with the original un-edited data, see the website mentioned above.
input('File: ').
We will refer to the resulting list of words as the "cleaned list".
input('N: ').N should be ≥ 0.
For example, suppose the linked list has the following values for word counts (in descending order of count):
monsoon: 10and N = 3, then the word at position 3 is chili, so k = 5 (because chili has count 5).
heat: 7
dry: 7
chili: 5
canyon: 5
bicycle: 3
room: 2
SInce the primary purpose of this assignment is to practice using linked lists, code that tries to sidestep linked lists, e.g., by copying data to or from Python lists and doing some or all of the actual work of the program in Python lists, will not get credit. (Python's csv module returns a Python list when reading from a file. This is OK. Your code should then process the strings in that list and insert them into your linked list, and subsequent processing should be done using linked lists.)
0
identifier1
...2
author3
date published4
title of story5
language6
...7
site URL8
country9–19
...
The highlighted field (field 4, "title of story") is the field this program will analyze.
print("{} : {:d}".format(word, count))where word is a string (a word occurring in some title in the input file) and count is an integer value giving the number of occurrences of that string in the title field of posts in the input file.
- class Node
- An object of this class represents information about a word. It should contain (at least) the following attributes:
- _word: the word string.
- _count: a count of the number of occurrences of the word.
- _next: a reference to the next node in the linked list.
It should implement (at least) the following methods:
- __init__(self, word): initializes the object's attributes as follows: _word is set to word; _count is set to 1; _next is set to None.
- word(self): returns the value of _word.
- count(self): returns the value of _count.
- next(self): returns the value of _next.
- set_next(self, target): sets the _next attribute to target.
- incr(self): increments the value of _count.
- __str__(self) and, optionally, __repr__(self).
- class LinkedList
- An object of this class represents a linked list. It should contain (at least) the following attributes:
It should implement (at least) the following methods:
- _head: a reference to the first node in the list.
- __init__(self): initializes _head to None.
- is_empty(self): returns a Boolean indicating whether or not the list contains any nodes.
- head(self): returns a reference to the first node in the linked list, None if the list is empty.
- update_count(self, word): if word is present in the list, increments its count; otherwise, adds a node for it with count initialized to 1.
- rm_from_hd(self): removes the first node in the linked list (updates the list's _head attribute appropriately) and returns this node. It generates an error if this method is invoked on an empty list.
- insert_after(self, node1, node2): node1 and node2 are references to nodes. Inserts node2 into the list containing node1 so that the node that comes after node1 is node2.
- sort(self): sorts the linked list in descending order by _count. See here for an algorithm.
- get_nth_highest_count(self, n): returns the count associated with the node in the linked list at position n.
- print_upto_count(self, n): print out all the words that have count at least n.
- __str__(self) and, optionally, __repr__(self).
Note It is OK to use or adapt the code you were given for the short problems for this assignment. If you do so, however, then for each such method that you did not write yourself, you must add a comment above the method indicating that it is taken from, or based on, code you were given as part of the short problem.
The following are errors:
Program behavior: Use a try statement to detect this. Give an error message and quit.
Error message: "ERROR: Could not open file " + filename
Program behavior: Use a try statement to detect this. Give an error message and quit.
Error message: "ERROR: Could not read N"
Program behavior: Use an assert to detect this. An assert failure will terminate the program. No error message is needed.