Aside from one problem, bench.txt
, there's no programming at all on this assignment. It's a combination of web research, creative thought, pondering, and a little writing.
For each of the problems whose name ends in .txt
(like morefacts.txt
and quote.txt
) you are to answer by turning in a plain ASCII text file. Use an editor like Sublime, Vim, Emacs, Notepad++, etc. DO NOT submit Word documents, PDFs, Rich Text files, HTML documents, etc. As a double-check, your .txt
files should look perfectly fine when displayed with cat
on lectura. Also, the file
command should show them as "ASCII text
" or maybe "ASCII English text
", or something similar.
For the .txt
problems on this assignment and others in this class we're
always happy to look at your work in advance and give you a quick opinion on it.
You can either mail it to 372s23
, or turn it in and tell us you did.
Don't wait too long, however, there may be a crunch as the deadline approaches.
spring23/a2
directory
The directory /cs/www/classes/cs372/spring23/a2
on lectura has a couple of shell scripts,
turnin
and run-bench
, and other files that you'll need to use to
complete the assignment. The instructions in this write-up assume that you're working on lectura
AND that you've made a symbolic link to the spring23/a2
directory.
Imagining that you're maintaining your assignment 2 files on lectura in
a directory named ~/csc372/assignment2
, here's how to make that symbolic link in the right place and then examine it with ls
:
% cd ~/csc372/assignment2 % ln -s /cs/www/classes/cs372/spring23/a2 a2 % ls -l a2 lrwxrwxrwx 1 whm root 33 Jan 16 14:57 a2 -> /cs/www/classes/cs372/spring23/a2
If we now do ls -l a2/
(don't forget the slash!), we'll see files that are actually in
/cs/www/classes/cs372/spring23/a2
: (files present, sizes, and dates may vary)
% ls -l a2/ total 19 -rw-r--r-- 1 whm whm 1026 Jan 16 21:32 bench-avg.py -rw-rw-r-- 1 whm whm 157 Jan 16 22:31 delivs -rwxr-xr-x 1 whm whm 1032 Jan 16 20:58 run-bench -rwxr-xr-x 1 whm whm 801 Sep 16 20:34 test-bench -rwxr-xr-x 1 whm whm 232 Jan 16 14:52 test-benches -rwxrwxr-x 1 whm whm 495 Jan 16 14:53 turnin -rw-r--r-- 1 whm whm 1917 Jan 16 22:12 whm-bench.txt -rw-r--r-- 1 whm whm 4294 Jan 16 16:15 whm-ten-things.txt
If you haven't worked with symbolic links, see slides 134+ in my 352 UNIX slides.
morefacts.txt
When covering slide 28 in the intro set I said a sentence or two about various languages of historical or current significance. For this problem I'd like you to find three languages that are not mentioned on that slide and tell me a sentence or two about each.
I'll compile all the answers and post them on Piazza. Follow this format for your answers:
cat
and fold
to display my morefacts.txt
and then wc -l
to demonstrate it's exactly three lines:
% cat morefacts.txt | fold -75 Ada 1980: The DoD's attempt to have one language for military embedded syst ems, instead of 450. -- William Mitchell Java 1995: The most rapidly adopted language of all time. In Spring 1997 I gave one lecture in 372 about Java as a rising language; by Fall 1998 it was being taught in 127A. -- William Mitchell Scala 2003: Proof that Germans should stick to beer and BMWs. -- anonymous % wc -l morefacts.txt 3 morefacts.txt
Feel free to use Google, Wikipedia, etc., for research on this question but needless to say, no posts anywhere soliciting ideas.
Above I say, "the year it appeared" but that's often subject to debate. Feel free to believe Wikipedia or go with another source.
Just to be clear, you may use anonymous attribution to keep your classmates from knowing you wrote a particular entry but what you submit must be original, not something you found on the net.
Some of you may find that you share a first, middle, or last name with a programming language.
quote.txt
We started the class with a quote from Alan J. Perlis about programming languages. For this problem I'd like you to find a quote you like about programming languages by somebody other than Perlis.
Put the quote in quote.txt
in any format you'd like. Here's an example using Perlis' quote:
% cat quote.txt "A language that doesn't affect the way you think about programming is not worth knowing." --Alan J. PerlisImportant: The quote must be about programming languages in some way, not just about programming or software development in general.
jshell.txt
I think it's pretty stupid that it took the developers/maintainers of Java twenty years to provide a REPL for Java but that's the sad fact: JShell was released as part of JDK 9, in 2017.
For this problem I'd like you to spend maybe fifteen minutes experimenting with JShell and write a few sentences about your observations. You might talk about what you liked, what you expected to work but didn't, what you'd like to change, surprising behaviors, etc. If your 210 instructor showed or encouraged JShell, did it help or hurt when learning Java? If you didn't see JShell in 210, do you think it would have helped you learn Java?
If you've got Java installed on your machine, running jshell
at a command-line prompt should start it up. If not, you can run jshell
on lectura.
Some interaction with JShell is shown on slides 24 and 52 in the Haskell set. Here's a little more interaction with JShell:
% jshell | Welcome to JShell -- Version 11.0.16 | For an introduction type: /help intro jshell> 3 + 4 $1 ==> 7 jshell> $1 + "abc" $2 ==> "7abc" jshell> int i, j = 7; i ==> 0 j ==> 7 jshell> /var | int $1 = 7 | String $2 = "7abc" | int i = 0 | int j = 7 jshell> /exit (or control-D to exit) | Goodbye %Note that
/var
shows variables and their values. /help
shows all the commands.
Also, at the jshell>
prompt, try typing "".s<TAB>
counterparts.txt
Many elements of Python and Java can be called counterparts or said to be analogous.
For example, Python's str
and Java's String
are
counterparts. Java's System.out.println
method is a counterpart to Python's print
function.
For this problem you are to identify fifteen examples of Java and Python counterparts (and elements thereof) that differ significantly from each other.
Here are four examples of significant differences in Python and Java counterparts:
1_000_000
, to improve readability but Python does not allow such.
int
s have unlimited size but Java provides that only through the clumsy BigInteger
class.
+=
to build a long string works fine in Python but performs terribly in Java, where
String{Buffer,Builder}.append(...)
is recommended instead.
Notes:
String.indent()
.
ten-things.txt
Find some programming language that you know essentially nothing about and, using examples of code, tell me ten things about that language. You may not write about Haskell, Prolog, or any LISP.
It's fine to copy code verbatim from wherever you find it, but include a list of your sources.
Take a look at
a2/whm-ten-things.txt
for an example of what we're looking for.
Use any sort of plain-text formatting that you want but number your ten items in some way.
expressions.txt
Let's apply what you've learned about type/value/side-effect by analyzing three expressions. I suggest that you work through each on paper first, and then try it with the Python Shell and JShell. (Remember: I'd call them REPLs, not "shells".)
int x = 3, y = 4, z;what's the type, value, and side effect(s), if any, of the following expression?
z = x++ * --y
float f = 0;what's the type, value, and side effect(s), if any, of the following expression?
Math.abs(-(f += 'f'))
d
is {False:[], None:True}
,
what's the type, value, and side effect(s), if any, of the following expression?
d[d[not 1].append(x:=1)]
Note: The :=
operator is new in Python 3.8. It has become known as the "walrus operator".
If you haven't seen it before, do some Googling, and experiment some—it's definitely a good addition
to your Python toolbox. It is formally described
here.
decisions.txt
In our first class we briefly talked about some of the language design decisions embodied in Python's print
function:
print(1,2,3)
or be limited to a single value like
Java's System.out.println
?
For this problem you are to identify three language design decisions embodied in EITHER the three-argument form
of Python's range
, i.e., range(start,stop,step)
OR Python's str.split
method.
bench.txt
Your task in this problem is to devise a simple computation whose execution time depends on one parameter, then implement it in Java and Python, and then compare the performance of the Java and Python versions for three different cases of parameter values. In essence, you are to create a simple benchmark to explore the relative performance of Java and Python.
In general, benchmarking is a complicated undertaking that's packed with pitfalls and has an interesting cast of characters. However, we're just going to have a little fun with benchmarking. I'll start by showing you my solution for this problem.
Here's the computation I devised for my solution/implementation of this problem:
For some number N, first build a string of the form
"1,2,...,N,"
. Then print N, the total length of the string, the number of zeroes in the string and then the contents of the string, in single quotes.
If N is 10, here's the expected output:
N=10: 21 total chars, 1 zeroes, s: '1,2,3,4,5,6,7,8,9,10,'If N is 7,000,000, here's the expected output, with about 55 million characters elided from the middle:
N=7000000: 54888896 total chars, 4088895 zeroes, s: '1,2,3,4,5,6,7,8,9, 10,11,12,13,14,15,16,17,18,19...6999995,6999996,6999997,6999998,6999999, 7000000,'
Here is my Java implementation of the computation:
public class bench { public static void main(String args[]) { //int Ns[] = {1, 3, 10}; // small Ns for testing int Ns[] = {1_000_000, 3_000_000, 7_000_000, 100_000, 200_000}; StringBuilder sb = new StringBuilder(); int N = Ns[Integer.parseInt(args[0])-1]; for (int i = 1; i <= N; i++) { sb.append(i); sb.append(','); } String s = sb.toString(); int count = 0; for (int i = 0; i < s.length(); i++) if (s.charAt(i) == '0') count++; System.out.format("N=%d: %d total chars, %d zeroes, s: '%s'\n", N, s.length(), count, s); } }
Here is my Python implementation of the computation:
import sys #Ns = [1, 3, 10] # small Ns for testing Ns = [1000000, 3000000, 7000000] s = "" N = Ns[int(sys.argv[1])-1] for i in range(1,N+1): s += str(i) + "," count = 0 for i in range(len(s)): if s[i] == '0': count += 1 print(f"N={N}: {len(s)} total chars, {count} zeroes, s: '{s}'")
Let's time the Python version for the first case: N=1,000,000. (Examine the code to see how the command-line argument
1
is used to select the first element of the list Ns
.)
% time python3 bench.py 1 >p real 0m1.171s user 0m1.121s sys 0m0.025s
The first non-blank line of time
output, with real 0m1.171s
, is sometimes called the
"wall-clock" time—it's the amount of time you'd have seen pass if you'd been watching a clock on the wall. The
third line, sys 0m0.025s
, is the amount of time the operating system spent doing work on your behalf,
i.e., the time spent in "kernel mode".
The middle line, user 0m1.121s
is the line we're interested in. It shows how much time we spent in our Python code: 1.121 seconds. (User time does include compilation into Python bytecode, but it's pretty negligible.)
The redirection >p
causes the Python program's output to be sent to a file named p
(for Python).
Because the time
command is built into shells (try type time
), its output can
vary from shell to shell.
The above output is from Bash. Here's what we might see with zsh
:
% time python3 bench.py 1 >p python3 bench.py 1 > p 1.14s user 0.02s system 99% cpu 1.166 total
Let's time the Java program for the first case. Instead of just using java bench.java
to compile and run, we'll compile it first with javac
, since Java compilation time
is somewhat significant. (Try time javac bench.java
to see how long the compilation takes.)
% javac bench.java % time java bench 1 >j real 0m0.199s user 0m0.212s sys 0m0.060s
Comparing the user
times from the two runs, we see that the Java version ran 5.29 (1.121/.212) times faster than Python.
We should also be sure that the Java and Python versions produced identical output. We'll use cmp
for a yes/no test:
% cmp j p %A behavior common among UNIX tools is for them to produce no output if an operation was successful. (In 352 I say, "Silence is Golden".) Above we see no output from
cmp j p
,
so j
and p
are identical.
To see what cmp
shows if the files differ, let's compare bench.java
and bench.py
:
% cmp bench.java bench.py bench.java bench.py differ: char 1, line 1
Let's look at the sizes, just for fun:
% ls -l j p -rw-r--r-- 1 whm staff 6888949 Jan 16 13:50 j -rw-r--r-- 1 whm staff 6888949 Jan 16 13:52 p
If a cmp
fails, or I get a "diff" with diff
, I'll often first look at the sizes of the output files. One of the files might be empty, for example.
Let's have a shell script do all the work of running both Java and Python versions,
extracting timings, ensuring identical output from both versions, and more. It also runs
javac bench.java
if needed.
If this were 352, I'd have you write that script
yourself but since this is 372, I'm supplying it for you. It's in a2/run-bench
.
Assuming you've made an a2
symlink,
you can type cat a2/run-bench
to read the code, and I hope you do!
Let's run the script. We'll give it one argument, a 1
, to indicate we want to run the first case.
(That argument, 1
, is in turn passed along to java bench 1
and python3 bench.py 1
.)
% a2/run-bench 1 Java: 0m0.360s Python: 0m1.666s
Just so you know what it looks like, here's an example of what's shown if there's a run-time error in
bench.java
:
$ a2/run-bench 1 Java: 0m0.013s Python: 0m1.059s cmp: EOF on j which is empty =====Java errors===== Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index -9 out of bounds for length 5 at bench.main(bench.java:9)A Python error would be similar.
Timing can vary from run to run. A good practice is to run a benchmark several times
and see if there's any significant variation in the times reported. To facilitate
that good practice, run-bench
accepts a second, optional argument that indicates that multiple
runs should be made.
Let's run the first case three times:
$ a2/run-bench 1 3 Java: 0m0.339s Python: 0m1.817s Java: 0m0.342s Python: 0m1.745s Java: 0m0.366s Python: 0m1.691s
Sure enough, we do see some slight variation, 0.339-0.366 seconds for the Java version, and 1.691-1.817 seconds for the Python version.
We have one more tool: a2/bench-avg.py
. It reads run-bench
output, echoes it, and
then prints some statistics on performance. Let's "pipe" the output of run-bench
into bench-avg.py
:
$ a2/run-bench 1 3 | python3 a2/bench-avg.py Java: 0m0.353s Python: 0m1.728s Java: 0m0.444s Python: 0m1.789s Java: 0m0.409s Python: 0m1.822s Java was 4.43x faster than Python (0.402s vs. 1.780s for 3 runs)We see that the mean time for the three Java runs was 0.402 seconds versus a mean of 1.780 seconds for the three Python runs, showing a 4.43x speed advantage for Java.
Let's try seven million for N via case 3, and run it four times:
$ a2/run-bench 3 4 | python3 a2/bench-avg.py Java: 0m0.908s Python: 0m14.167s Java: 0m0.821s Python: 0m13.100s Java: 0m0.959s Python: 0m13.314s Java: 0m0.922s Python: 0m13.046s Java was 14.86x faster than Python (0.90250s vs. 13.40675s for 4 runs)
We see that for this benchmark, Java's advantage seems to increase as N grows.
For this problem you are to:
One of my Fall 2022 teaching assistants, Amy Paul, wrote this good advice about finding a computation:
Personally, I find that what makes a good benchmark problem is just something I find interesting. There are a lot of neat algorithms and programming tricks out there that I wouldn't normally think to use or try, so benchmarking is a good way to investigate those. What you decide to do for this problem is really up to you, but if you want a place to start I'd suggest just Google searching for interesting programming algorithms and go from there!
If your computation involves random numbers, note that you'll need to ensure that both the Java and Python versions produce the same sequence of random numbers. A simple way to achieve that is to write your own random number generator, and that can be pretty simple with a linear congruential generator.
If you're having trouble settling on a computation for your benchmark, mail to 372s23
and we'll give you some feedback.
Important: Your computation should produce output that provides some evidence that both versions are doing the same computation. For example, my computation prints the entire string that's built. If you do some kind of sort, printing the sorted values or the number of comparisons made would provide some evidence of sameness.
bench.java
and
bench.py
, respectively.
The values of N for my three cases are one million, three million, and seven million. The Python version takes about 1, 3, and 7 seconds respectively for each.
1
, 2
, or 3
.
That command-line argument selects which of the three values of N to use.
(If you haven't worked with command-line arguments, stop by during office hours and we'll get you squared away.)
There are three deliverables for this problem: bench.java
, bench.py
, and bench.txt
.
bench.txt
is a plain text file that should contain:
a2/run-bench 1 3 | python3 a2/bench-avg.py a2/run-bench 2 3 | python3 a2/bench-avg.py a2/run-bench 3 3 | python3 a2/bench-avg.py
bench.txt
in
a2/whm-bench.txt
.
quickrepl.txt
After working with languages that have a REPL available,
you may find yourself really wanting a REPL when you go to learn a language that doesn't have one.
One approach to a REPL is to integrate it with a language's implementation, but that typically requires a good understanding of that implementation.
Another approach is a completely standalone REPL, such as once provided by javarepl.com
, but that can require a lot of code—wc
showed about 8,000 lines in Java source files for that system.
For this problem your challenge is to come up with a quick approach to write a simple REPL for a language. Your quick REPL needn't be nearly as good as ghci
but it should be able to do the sorts of things you do when learning a language, such as evaluating expressions and showing types. Think about an approach that provides a lot of functionality for relatively little effort—something you could implement in a day, and not be late for dinner.
I'm not asking you to write any code for this problem; just describe how you might quickly produce a simple REPL for a language of your choice.
Python, Ruby and other languages have an "eval(s)
" function that executes the code contained in a string s
. Do not rely on such a capability in your quick REPL.
Note: This is a hard problem that requires some creativity and cleverness, so I'm making it worth zero points, but I'm curious to see who'll do it anyway! For this problem you are free to work in any groups of any size, but for me to possibly be impressed with your answer you'll have to specifically state that you came up with whatever you did all by yourself/yourselves—no Googling.
If you work in a group, all members may submit identical copies of quickrepl.txt
, but it should include a list of group members.
observations.txt
Submit a plain text file named observations.txt
with...
(a) (1 point extra credit) An estimate of how many hours it took you to complete this assignment. Put that estimate on a line by itself, like this:
Hours: 3.5There should be only one
"Hours:"
line in observations.txt
. (It's fine if you care to provide per-problem times, and that data is useful to us, but report it in some form of your own invention that doesn't contain the string "Hours:"
. Thanks!)
Feedback and comments about the assignment are welcome, too. Was it too long, too hard, too detailed? Speak up! I appreciate all feedback, favorable or not.
(b) (1-3 points extra credit) Cite an interesting course-related observation (or observations) that you made while working on the assignment. The observation should have at least a little bit of depth. Think of me thinking "Good!" as one point, "Excellent!" as two points, and "Wow!" as three points.
Note: I'm looking for quality, not quantity!
morefacts.txt quote.txt jshell.txt counterparts.txt ten-things.txt expressions.txt decisions.txt bench.java bench.py bench.txt quickrepl.txt (zero points—it's optional!) observations.txt (for extra credit)
Note that all characters in the file names are lowercase.
Do not include your name, NetID, etc. in your .txt
files—I like reading answers without knowing who wrote them.
Use a2/turnin
to submit your work. It creates a time-stamped "tar" file (similar to a zip file) that contains the deliverables listed above and then uses the system-wide turnin
program to actually turn in that tar file.
Assuming you've made that symbolic link, you should see something like the following when you run a2/turnin
:
% a2/turnin Warning: quickrepl.txt not found ======== contents of a2.20230116.223543.tz ======== -rw-r--r-- whm/whm 387 2022-08-25 16:53 morefacts.txt -rw-r--r-- whm/whm 134 2022-08-25 16:53 quote.txt -rw-rw---- whm/whm 1170 2022-08-25 16:53 jshell.txt -rw-rw-r-- whm/whm 851 2023-01-16 22:35 counterparts.txt -rw-r--r-- whm/whm 4294 2023-01-16 22:33 ten-things.txt -rw-rw-r-- whm/whm 58 2023-01-16 22:35 expressions.txt -rw-rw-r-- whm/whm 420 2023-01-16 22:35 decisions.txt -rw-r--r-- whm/whm 685 2023-01-16 22:34 bench.java -rw-r--r-- whm/whm 320 2023-01-16 22:34 bench.py -rw-rw---- whm/whm 1917 2023-01-16 22:34 bench.txt -rw-rw---- whm/whm 483 2022-08-25 16:53 observations.txt ======== running turnin ======== Turning in: a2.20230116.223543.tz -- ok All done.Notice the first line of output:
"Warning: quickrepl.txt not found"
. It indicates that the deliverable
quickrepl.txt
was not found. We'll imagine that because that quickrepl.txt
is a zero-point problem the student elected to not do it.
The contents of the "tar" file that was created, named a2.20230116.223543.tz
, are shown after that warning. Finally, the system-wide turnin
program is run and its output is shown—we can see that the
tar file with the deliverables was turned in ok.
You can run a2/turnin
as often as you want. We'll grade your final non-late submission. We can add a -ls
option to a2/turnin
to show what's been turned in:
% a2/turnin -ls
.:
total 17
-rw-rw---- 1 whm cs372s23 1065 Jan 16 14:12 a2.20230116.141247.tz
-rw-rw---- 1 whm cs372s23 3432 Jan 16 22:35 a2.20230116.223543.tz
If you're curious, and I hope you are, look at a2/turnin
.
Remember that late submissions for assignments are not accepted, and that there are no late days; but if circumstances beyond your control interfere with your work on this assignment, there may be grounds for an extension. See the syllabus for details.
Note that because each run of a2/turnin
creates a new file, it can be used for a low-tech backup/snapshot of your work.