a6
symlink that references /cs/www/classes/cs372/spring18/a6
.
a6/tester
(or a6/t
).
~/.bashrc
has a series of aliases like this:
... alias t5=a5/tester alias t6=a6/tester alias t7=a7/tester ...With those aliases I can just type
t6 switched
to run the tester for switched.rb
on this assignment.
I could go further and have per-problem aliases, too:
alias tsw="a6/tester switched"
Each of the first four problems ask you to write an iterator, which is a single method, rather than a program. The write-up for the first iterator,
eo
, shows a couple of ways to approach the edit-run cycle.
Each of the iterators has a "duck typing" specification that describes what the iterator requires of its argument(s),
such as being subscriptable, or having a size
or each
method.
IMPORTANT: The first test for each iterator is a Ruby program with a name like ITERATOR-ap.rb
. The "ap"
stands for all points—when grading, all the points for the problem will be based on whether you pass the "ap" test. The "ap" tests
use dumbed-down classes that provide only the capabilities that the iterator is supposed to require. For example, eo-ap.rb
uses
a class named Dumb
that provides only subscripting with x[n]
and a size
method. If you're passing some cases for an iterator but failing the "ap" test, then your implementation is requiring more capabilities of its argument(s) than it should be, according to the duck typing specification.
For one point of extra credit, track the the total time you spend on problems 1-4 (the iterators)
and report it in observations.txt
with a line of this form:
Hours for iterators: 1.5
If you do that, do not subtract those hours from your overall hours.
For example, if you spend 1.5 hours on the iterators and another 4 hours on the
rest of the assignment, report Hours: 5.5
in your observations.
eo.rb
eo(x)
that yields every other element in x
.
eo(x)
returns x
.
Duck typing: eo
only requires x
to be subscriptable with x[n]
and have a size
method.
Examples:
>> eo([10,20,30,40]) {|v| puts v } 10 30 => [10, 20, 30, 40] >> eo("testing") {|c| print "'#{c}' "} 't' 's' 'i' 'g' => "testing" >> sum = 0; eo((1..10).to_a) {|v| sum += v}; sum => 25 >> eo([]) {|v| puts v } => []
eo
and the three other iterators.
First, add the following line to your ~/.irbrc
:
$eo="eo.rb"Then, after restarting
irb
, you can do this:
$ irb >> load $eo => true >> eo("abcd") {|c| puts c} a c => "abcd"That works because
load
is a Ruby function. That contrasts with :load
in ghci
, which
is an operation provided by the REPL, not the Haskell language.
You can edit in another window and then do load $eo
to load up the latest.
Another angle to save a little typing is add a method like this to to your .irbrc
:
def ld s load s.to_s + ".rb" endUse that
ld
method like this:
$ irb >> ld :eo => trueBy giving
ld
a
Symbol
,
:eo
above, it's sort of like using a string literal that needs a quote on only one end.
(Atoms in Lisp are similar.)
ld
converts its argument to a string, tacks on the ".rb"
suffix, and calls load
with the result.
Again, we get this sort of flexibility because load
is a Ruby method, but on the other hand, I still miss the filename
completion that ghci
provides.
tkcycle.rb
tkcycle(x, sizes)
that yields consecutive "slices" of x
based on the integers in sizes
.
tkcycle
cycles through the sizes in sizes
until it runs out of elements in x
.
tkcycle
always returns nil
.
Duck typing: tkcycle
only requires x
to support a "slice" operation with x[start,length]
and have a size
method. sizes
is assumed to be a non-empty array of non-negative integers.
Examples:
>> tkcycle((1..10).to_a,[1,2]) {|s| p s} [1] [2, 3] [4] [5, 6] [7] [8, 9] [10] => nil >> tkcycle("just a test", [3]) {|s| puts "slice = #{s}"} slice = jus slice = t a slice = te => nil >> tkcycle("just a test", [30]) {|s| puts "slice = #{s}"} => nil >> tkcycle("just a test", [3,0,2]) {|s| puts ">#{s}<"} >jus< >< >t < >a t< >< >es<Note that
tkcycle("just a test", [3])
above demonstrates that tkcycle
never yields a partial result:
After the third block invocation, with s = " te"
, only "st"
remains, and that's not enough for another
three-element yield.
vrepl.rb
vrepl(x)
that produces an array consisting of varying numbers of repetitions of elements in x
.
The number of repetitions for an element is determined by the result of the block when the iterator yields that element to the block.
Duck typing: vrepl
only requires x
to have an each
method.
Examples:
>> vrepl(%w{a b c}) { 2 } => ["a", "a", "b", "b", "c", "c"] >> vrepl(%w{a b c}) { 0 } => [] >> vrepl((1..10).to_a) { |x| x % 2 == 0 ? 1 : 0 } => [2, 4, 6, 8, 10] >> i = 0 => 0 >> vrepl([7, [1], "4"]) { i += 1 } => [7, [1], [1], "4", "4", "4"]If the block produces a negative value, zero repetitions are produced:
>> vrepl([7, 1, 4]) { -10 } => []
mirror.rb
mirror(x)
that yields a "mirrored" sequence of values based on the values that x.each
yields.
Duck typing: mirror
only requires that x
implement the iterator each
.
The value returned by mirror
is always nil
.
Examples:
>> mirror(1..3) { |v| puts v } 1 2 3 2 1 => nil >> mirror([1, "two", {a: "b"}, 3.0]) { |v| p v } 1 "two" {:a=>"b"} 3.0 {:a=>"b"} "two" 1 => nil >> mirror({:a=>1, :b=>2, :c=>3}) {|x| p x} [:a, 1] [:b, 2] [:c, 3] [:b, 2] [:a, 1] => nil >> mirror([]) { |v| puts v } => nilLike the previous three iterators,
mirror
is a freestanding method.
switched.rb
switched.rb
to look for names that change from predominantly male to predominantly female in given spans of years.
switched.rb
takes two command-line arguments: a starting year and an ending year. Here's a run:
$ ruby switched.rb 1951 1958 1951 1952 1953 1954 1955 1956 1957 1958 Dana 1.19 1.20 1.26 1.29 1.00 0.79 0.67 0.64 Jackie 1.40 1.29 1.14 1.13 1.11 0.94 0.72 0.57 Kelly 4.23 2.74 3.73 2.10 2.32 1.77 0.98 0.51 Kim 2.58 1.82 1.47 1.08 0.61 0.30 0.17 0.12 Rene 1.43 1.32 1.15 1.24 1.13 0.88 0.87 0.89 Stacy 1.06 0.81 0.62 0.47 0.44 0.36 0.29 0.21 Tracy 1.51 1.14 1.02 0.73 0.56 0.55 0.59 0.59First, note that all numbers in the leftmost column are greater than one and all numbers in the rightmost column are less than one. The 1.19 for Dana in 1951 indicates that in 1951 there were 1.19 times as many male babies named Dana as there were female babies named Dana. We can see that in
a6/yob/1951.txt
, which has the 1951 data:
$ grep Dana, a6/yob/1951.txt Dana,F,1076 Dana,M,1277The format of the
a6/yob/YEAR.txt
data files is simple: each line has the name, sex, and the associated count, separated by commas.
Note that the argument to grep
, "Dana,
" has a trailing comma so that "Danae
" doesn't turn up, too.
By 1958 things had changed—there were only 0.64 males named Dana for every female named Dana:
$ grep Dana, a6/yob/1958.txt Dana,F,2388 Dana,M,1531
switched
reads the a6/yob/YEAR.txt
files for all the years in the range specified by the command line arguments
and looks for names for which the male/female ratio is > 1.0 in the first year and < 1.0 in the last year.
For all the names it finds, it prints the male/female ratio for all the years from the first year through the last year. Names are printed in alphabetical order.
As a specific example, Dana is included in the list for 1951 through 1958 because males/females in 1951 was 1.19 (> 1.0) and males/females in 1958 was 0.64 (<1.0). The ratios for the middle years are not examined to decide whether to include a name; they are shown only to provide a more complete picture of the data between the endpoints.
Note that there's a big shift for Kim from 1954 through 1957. I wonder if that's because the actress Kim Novak had a breakout role in 1955's Picnic.
Spring 2016 student Mai Tong noticed this for "Addison":
1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 Addison 1.11 0.82 0.68 0.53 0.53 0.44 0.36 0.28 0.25 0.21 0.13 0.06 0.03 0.02 0.02 0.02
Ms. Tong wrote,
I looked up on Google to find the reason why and I found that the name means "Son/Child of Adam", which is interesting since Addison is now commonly used as a girl's name rather than boy's. Possibly due to uncommon use for a girls name back in the early 1990s? But looking closely, there's a show called Grey's Anatomy (premiered in 2005) with a popular female-lead named Dr. Addison Montgomery which might be why the ratio dropped dramatically in 2005 and onwards.
Back to the details, if no names meet the criteria, switched
prints "no names found
" and exits by calling exit
:
$ ruby switched.rb 2011 2012 no names foundIMPORTANT: To eliminate the less significant results, a name is included only if both the male and female counts in both the first and last year are greater than or equal to 100. By that criteria the name Lavern is not included for 1949-1951, and no other names turn up, either:
$ ruby switched.rb 1949 1951 no names foundHere's the underlying data:
$ grep Lavern, a6/yob/1949.txt Lavern,F,93 Lavern,M,121 $ grep Lavern, a6/yob/1951.txt Lavern,F,95 Lavern,M,86There was a M/F shift from 121/93 in 1949 to 86/95 in 1951 but because not all four of those counts are >= 100, Lavern is not included. There's no
grep
shown above for 1950 because that data is inconsequential: inclusion is determined solely based on counts for
the first and last year.
It's interesting to combine switched
with a Bash for
-loop that runs the program with a gradually shifting range.
When your switched
is done, try this:
for i in $(seq 1940 2005); do ruby switched.rb $i $(($i+9)); echo ===; doneTwo obvious extensions to
switched
would be command-line options to adjust the 100-baby minimum and to look for female to male flips for a name. You might find those interesting to implement and experiment with, but neither are required.
switched
does no error handling whatsoever. Behavior is only defined in the case of being given two command line arguments in the range of 1885 to 2014, and the first must be less than the second.
switched
I intend this problem to be an exercise in using the Hash
class. I encourage you to devise a data structure yourself but in case you run
into trouble, here are a few thoughts on my approach to the problem:
http://www.cs.arizona.edu/classes/cs372/spring18/a6/switched-hint.html
It's easy to drown in the data on a problem like this.
You might start by having your code that reads the YEAR.txt
files discard data for everybody but "Dana" and then use
p x
(equivalent to puts x.inspect
) to dump out your data structure.
Then try adding in a male-only and a female-only name, like "Delbert" and "Mimi" in 1951.
Alternatively, you might edit down some data files to just a few lines of interest. (After copying any of the a6/yob
files into
your directory, use chmod 600 FILE
so that you can edit it; cp
will have left it as mode 444
—read only.)
Watch out for bugs related to integer division. (Use .to_f
to get a Float
when needed.)
Use File.open
to produce a File
object whose gets
method can be used to read lines. Example:
$ cat fileio.rb year = ARGV[0] f = File.open("a6/yob/#{year}.txt") count = 0 while line = f.gets count += 1 end f.close puts "read #{count} lines" $ ruby fileio.rb 2001 read 30258 linesAlternatively, you could use
f.readlines()
to produce an array of all the lines in the file with a single call or
f.each { ... }
to process each line with the associated block.
The M/F ratios are formatted using a %7.2f
format with printf
, demonstrated on the command line with
ruby -e
:
$ ruby -e 'printf("%7.2f\n", 1277.0/1076.0)' 1.19
Names are left-justified in a 10-wide field using a printf
format of %-10s
.
You may have questions about the data files. Before mailing to us or posting on Piazza, take a look at the data files and see if
you can answer the question yourself. The files are in a6/yob
. Those same files will be used for testing when grading.
You can download http://www.cs.arizona.edu/classes/cs372/spring18/a6/yob.zip
for testing on your own machine.
In the same directory as your switched.rb
, make a directory named a6
and then unzip yob.zip
in that directory to produce a structure compatible with the File.open
show above.
label.rb
Deadline exception:
To give you a little schedule flexibility, you may do this problem, label.rb
, as part of assignment 7
rather than this assignment.
If you wish to do that, simply don't submit a solution for label.rb
on this assignment. You'll see a zero for this
problem on assignment 6, and your overall average will be about a point or so low, but you'll make it up on assignment 7. Similarly,
students who submit label.rb
on this assignment should not submit label.rb
on assignment 7, and will see a zero
for it there.
Barring the unforeseen, assignment 7 will be due on Friday, April 6.
In lecture we've seen that Array.inspect
, which is used by Kernel.p
and by irb
,
does not accurately depict an array that contains multiple references to the same array and/or references itself. Example:
>> a = [] >> b = [a,a] >> p b [[], []]
By simply examining the output of p b
we can't tell whether b
references two distinct arrays or has two references to
the same array.
Another problem is that if an array references itself, Ruby "punts" and shows "[...]
":
>> a = [] >> a << a >> p a [[...]]The problems continue if we then get a
Hash
involved:
>> h = {} >> h[a] = h >> p h {[[...]]=>{...}}
For this problem you are to write a method label(x)
that produces a labeled representation of x
,
where x
is a string, number, array or hash. Arrays and hashes may in turn reference other arrays and hashes, and be cyclic.
Here's what label
shows for the first case above.
>> a = []; b = [a,a] >> puts label(b) a1:[a2:[],a2]The outermost array is labeled as
a1
. Its first element is an empty array, labeled a2
.
The second element is a reference to that same empty array, but its contents are not shown, only the label a2
.
Note: For reasons described later, we'll use puts label(...)
to show the string that label
produces.
Here's another step, and the result:
>> c = [b,b] >> puts label(c) a1:[a2:[a3:[],a3],a2]Note that the label numbers are not preserved across calls. The array that this call labels as
a3
was labeled as a2
in the previous example.
To explore relationships between the contents of a
, b
, and c
we could wrap them in an array:
>> puts label([a,b,c]) a1:[a2:[],a3:[a2,a2],a4:[a3,a3]]Here's a simple cyclic case. The third element in
a
is a reference to a
; representing that reference with
a label lets us see the cycle.
>> a = [] >> a = [10,20] >> a << a >> puts label(a) a1:[10,20,a1]Let's try a simple hash:
>> h = {} >> h["one"] = 1 >> h[2] = "two" >> puts label(h) h1:{"one"=>1,2=>"two"}Note that hashes are labeled as "
hN
". The key/value pairs are shown with "thick" arrows.
Pairs are separated with commas, and curly braces surround the list of pairs.
Let's add some arrays into the mix:
>> h = {} >> a = [2,2,2] >> a << a >> h["twos"] = a >> h["words"] = %w{just a test} >> puts label(h) h1:{"twos"=>a1:[2,2,2,a1],"words"=>a2:["just","a","test"]}Note that arrays and hashes have separate numbering: the above labeling shows both
h1
and a1
, for example.
Let's try h[h] = h
:
>> h[h] = h >> puts label(h) h1:{"twos"=>a1:[2,2,2,a1],"words"=>a2:["just","a","test"],h1=>h1}
label(h)
eventually reaches the key/value pair made by h[h] = h
.
Because h
has already been labeled as h1
, the presence of that key/value pair is shown as h1=>h1
.
Another example:
>> a = [1,2,3] >> a << [[[a,[a]]]] >> a << a >> puts label(a) a1:[1,2,3,a2:[a3:[a4:[a1,a5:[a1]]]],a1]One more example:
>> a = [1,2,3] >> b = {"lets" => "abc", 1 => a} >> 3.times { a << b } >> a << "end" >> puts label([a,b,{100=>200}]) a1:[a2:[1,2,3,h1:{"lets"=>"abc",1=>a2},h1,h1,"end"],h1,h2:{100=>200}]The trivial case for
label(x)
is that x
is not an array or hash. If so, label(x)
returns x.inspect
:
>> puts label(4) 4 >> puts label("testing") "testing"Along with the two cases immediately above, here are some simple cases that are good for getting started:
>> puts label([7]) a1:[7] >> puts label([[7]]) a1:[a2:[7]] >> puts label([[70],[80,"90"]]) a1:[a2:[70],a3:[80,"90"]]Keep in mind that your solution must be able to accommodate an arbitrarily complicated structure but the only types you'll encounter are numbers, strings, arrays and hashes.
The above examples use puts label(...)
rather than showing the result of label(...)
.
Let's look at a label
call without puts
:
>> label([1,"two",["3"]]) => "a1:[1,\"two\",a2:[\"3\"]]"
label
returns a string and irb
uses inspect
to show an unambiguous representation of results,
so any quotes in the result are "escaped" with a backslash, so that the result is shown as a valid string literal.
Using puts
lets us avoid that clutter:
>> puts label([1,"two",["3"]]) a1:[1,"two",a2:["3"]] => nil
label.rb
http://www.cs.arizona.edu/classes/cs372/spring18/a6/label-hint.html
IMPORTANT: You must match the sequence of label numbers that my solution produces. That essentially requires you to traverse the structure in the same order I do, which is depth-first. Here's an example that evidences that depth-first traversal:
>> puts label([[[10]],[[21,22]]]) a1:[a2:[a3:[10]],a4:[a5:[21,22]]]Note that the deeply nested
[10]
was labeled with a3
before the second element of the top-level list was labeled
with a4
.
Also, note that I iterate through key/value pairs in a hash using h.each {|k,v| ... }
.
label
label.rb
. It's this: ruby a6/label1.rb
.
You may need to dig around in a6/label1.rb
to figure out exactly what code is being executed for a particular failure and
associated "diff".
It may be useful to copy a6/label1.rb
into your directory and hack it up, maybe adding Kernel.p
calls or making
other changes to track down a bug.
In various places in the output from a6/label1.rb
you'll see something like this:
Line 35 in a6/label1.rb: label([a,b,c]): a1:[a2:[],a3:[a2,a2],a4:[a3,a3]]That first line tells us that the test originates at line 35 in
a6/label1.rb
. Here's that line:
sv("label([a,b,c])", bb)
sv
, standing for "show values", is a helper method that uses Ruby's Kernel.eval
method
to evaluate the expression specified by the first argument. The second argument, bb
, is the current set of variable
bindings, which are used by eval
.
label
is a simplified version of the Image(x)
procedure from the Icon library.
In CSC 550A/B, String and List Processing,
taught in the 1980s, Ralph Griswold used Image(x)
as an example. Icon's Image(x)
, and label(x)
,
which you are about to write in relatively few lines of code, are incredibly handy for dealing with cycling data structures
but I know of no equally powerful function that's part of the standard library in any other language.
From time to time I've asked in various IRC channels and web forums in search of something similar in other languages but those questions are often met with "Why would you need that?", "There's no reason for an array to reference itself!", etc. Of course, graphs and "a maze of twisty little passages, all alike" are immediate applications of self-referential data structures.
If you ever come across anything like the label(x)
method you're about to write, let me know!
pancakes.rb
The Ruby version is a program that reads lines from standard input, one "order" per line, echoes the order, and then shows the pancakes.
Example:
$ cat a6/pancakes.1 3 1 / 3 1 5 3 1 3 1 5/ 1 1 1/11 3 15 /3 3 3 3/1 1 $ ruby pancakes.rb < a6/pancakes.1 Order: 3 1 / 3 1 5 *** *** * * ***** Order: 3 1 3 *** * *** Order: 1 5/ 1 1 1/11 3 15 /3 3 3 3/1 *** * *********** *** * * *** *** ***** * *************** *** * Order: 1 * $A blank line is printed after the "
Order:
" line and again after the stacks.
Make these assumptions:
1 / / 3" or "/ 3 /
".
If an order specifies an even-width pancake, the message shown below is printed. Processing then continues with the next order in the input, if any.
$ ruby pancakes.rb < a6/pancakes.2 Order: 1 3 1 / 1 2 3 Even-width pancake. Order ignored. Order: 51 49 *************************************************** ************************************************* $In case you want to play "Beat the Teacher", I'll tell you that it took me about 25 minutes to write
pancakes.rb
,
sketching on paper included.
If you care to, let me know how long it takes you. Think about it all you want "for free", but start the clock the moment a tangible artifact is produced, like a mark on a piece of paper.
observations.txt
Submit a plain text file named observations.txt
with...
(a) (1 point extra credit) An estimate of how many hours it took you to complete this assignment. Put that estimate on a line by itself, like this:
Hours: 3.5There should be only one
"Hours:"
line in observations.txt
. (It's fine if you care to provide per-problem times, and that data is useful to us, but report it in some form of your own invention that doesn't contain the string "Hours:"
. Thanks!)
Feedback and comments about the assignment are welcome, too. Was it too long, too hard, too detailed? Speak up! I appreciate all feedback, favorable or not.
(b) (1-3 points extra credit) Cite an interesting course-related observation (or observations) that you made while working on the assignment. The observation should have at least a little bit of depth. Think of me thinking "Good!" as one point, "Excellent!" as two points, and "Wow!" as three points. I'm looking for quality, not quantity.
Use a6/turnin
to submit your work. Each run creates a time-stamped "tar file" in your current directory with a name like aN.YYYYMMDD.HHMMSS.tz
.
You can run a6/turnin
as often as you want. We'll grade your final pre-deadline submission.
Use a6/turnin -l
to list your submissions.
To give you an idea about the size of my solutions, here's what I see as of press time:
$ wc $(grep -v txt a6/delivs) # this command should work for you, too 8 17 93 eo.rb 19 37 332 tkcycle.rb 8 29 158 vrepl.rb 17 23 205 mirror.rb 57 142 1241 switched.rb 22 57 611 label.rb 39 112 851 pancakes.rb 170 417 3491 totalMy code has few comments.
Note that each of the a6.*.tz
files created in your directory by a6/turnin
is essentially a
time-stamped snapshot of your code. (If you need to recover a file and aren't familiar with tar
, perhaps mail to 372s18
—it's easy to accidentally overwrite your latest versions.)
Point values of problems correspond closely to the "assignment points" mentioned in the syllabus. For example, a 10-point problem would correspond to about 1% of your final grade in the course.
Feel free to use comments to document your code as you see fit, but note that no comments are required, and no points will be awarded for documentation itself. (In other words, no part of your score will be based on documentation.)
In Ruby, #
is comment to end of line, unless in a string literal or regular expression.
There's no Ruby analog for /* ... */
in Java and {- ... -}
in Haskell but
you can comment out multiple lines by making them an embedded document—lines bracketed
with =begin/=end
starting in column 1. Example:
=begin Just some comments here. =end
Remember that late assignments are not accepted and that there are no late days; but if circumstances beyond your control interfere with your work on this assignment, there may be grounds for an extension. See the syllabus for details.
My estimate is that it will take a typical CS junior from 7 to 9 hours to complete this assignment.
Our goal is that everybody gets 100% on this assignment AND gets it done in an amount of time that is reasonable for them.
Assignments are not take-home tests! We hope you'll make use of Piazza, email and office hours if you run into trouble. We're happy to help you get started with any or all of these problems but in any event, if you put six hours into this assignment and don't seem to be close to completing it, it's definitely time to touch base with us. Specifically mention that you've reached six hours. Give us a chance to speed you up!