All milestones must be met for full credit.
Item | If you select from the Implementation Track: | If you select from the Paper Track: | Grade |
---|---|---|---|
Checkpoint 1 | Selection of one project type and initial list of resources | Selection of topic and initial list of papers | Ungraded |
Checkpoint 2 | Initial report discussing progress, hurdles, and other challenges; and git repo containing at least three non-trivial/starter commits | Complete first draft, including citations | Ungraded |
Checkpoint 3 | Feedback/suggestions on 2 other students’ initial reports | Peer review | 10 points |
Final Submission | Completed code, full git repository, completed writeup, and document summarizing how the feedback was used | Completed final paper + summary-of-changes document | 40 points |
Final grade | | | = total / 50 |
For the graduate assignment, choose ONE of the following assessments. The possible assessments are divided into an “Implementation Track” and a “Paper Track.” Both tracks consist of advanced topics that we may cover very lightly, or mention, in the course; however, we will not spend significant time on any of them. The Implementation Track will require you to implement and adapt NLP techniques. For the Paper Track, you will write a 4-page conference-style literature review paper on a particular topic. While each person is free to choose which of these two options they will complete, I encourage each person to consider carefully how each option could further their career or degree goals, and to choose the assessment that maximizes progress toward those goals. For example, someone writing a dissertation or thesis (doctoral or master's) may wish to do the literature review paper, since it could help form the basis of the related work chapter. On the other hand, someone may wish to dive deep into an aspect of NLP implementation.
Implement your proposed solution from HW 1 for one of the three parts (a, c, or e):
a. The user types in a keyword that they’re interested in, and the app finds relevant textbooks.
c. The app displays a question relevant to the chapter.
e. The app gives a numerical score for how well the user answered the question.
Once you select which problem you will do, the rest of the project will include:
At the end of the semester, you will submit:
Accounting for items like tables and graphs, an appropriate target length for this writeup is approximately 2-4 pages. Within the Implementation Track, you will be primarily evaluated on the completeness of your implementation. However, the thoroughness and clarity of the writeup will be a sizeable (but non-majority) portion of your grade.
Milestone 1: Selection of Option. Select which of the options (a, c, or e) you will do and fill out the short form online. Write a short description of your initial idea for how you might implement it.
Milestone 2: Initial draft of writeup, and git repo of your code with ≥ 3 non-trivial commits. You must submit two items: an initial version of your writeup, and a git repository of your implementation containing at least three non-trivial commits (e.g., commits affecting more than just whitespace or comments). Your writeup does not need to include results, but it does need to include a discussion of your proposed system; a discussion of your implementation (or expected implementation), including any hurdles you are currently encountering; a discussion of what data you are using (and whether and how you adapted it); and a discussion of what tests you are running or will run to ensure correctness.
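As a rough self-check on the "more than just whitespace or comments" criterion above, the following sketch classifies a unified diff as trivial or non-trivial. It is only an illustration, not the course staff's grading procedure: the helper names `_normalize` and `is_nontrivial_diff` are hypothetical, and the comment rule assumes Python-style `#` comments.

```python
from collections import Counter


def _normalize(line: str) -> str:
    """Remove all whitespace; map '#' comment lines to the empty string.

    Assumption: comments start with '#', as in Python. Other languages
    would need a different rule.
    """
    stripped = line.strip()
    if stripped.startswith("#"):
        return ""
    return "".join(stripped.split())


def is_nontrivial_diff(diff_text: str) -> bool:
    """Return True if a unified diff changes more than whitespace or comments.

    Heuristic: compare the multiset of normalized added lines with the
    multiset of normalized removed lines. If they match, every change was
    whitespace-only, comment-only, or a pure reordering of lines.
    """
    added: Counter = Counter()
    removed: Counter = Counter()
    for line in diff_text.splitlines():
        # Skip the file header lines of a unified diff.
        if line.startswith(("+++", "---")):
            continue
        if line.startswith("+"):
            key = _normalize(line[1:])
            if key:
                added[key] += 1
        elif line.startswith("-"):
            key = _normalize(line[1:])
            if key:
                removed[key] += 1
    return added != removed
```

One way to use it is to pass in the text printed by `git show` for each commit. Note the limits of the heuristic: it treats pure line reordering as trivial, and it would wrongly skip a changed line whose own content begins with `--` or `++`.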
You will turn in:
Milestone 3: Feedback on Discussion. You will receive up to two other students’ initial writeups; you must provide feedback on the breadth, depth, and clarity of exposition. You may also provide suggestions on hurdles that are described in the writeup(s). Reviewing forms and guides will be provided. To receive full credit, your reviews must be constructive and civil. This feedback will be “double-blind”: as a reviewer, you will not know whose writeups you are reviewing, and as an author, you will not know who your reviewers are. This is why it is important for the Milestone 2 drafts to be anonymized. All author and reviewer identities will be known to course staff.
Final Writeup and Full Code. This must be a complete, well-written writeup. You will turn in:
(GA4) Write a literature review paper in which you select one of a set of topics, and (i) identify, (ii) analyze, and (iii) synthesize modern approaches for the topic you choose.
Here are some examples of literature review papers (also called survey papers) that have been published at conferences on more specific NLP topics. Your paper should be like these, although it doesn’t have to be quite as thorough.
Select one of the following topics (see Paper Topics), and (i) identify, (ii) analyze, and (iii) synthesize modern approaches for the topic you choose.
Identify. For this assignment you will need to find an appropriate number of papers to discuss in detail. Though the final number that you select is highly dependent on, among other things, which topics you choose, the length of the papers, and their venues, a reasonable number of papers is between five and ten. This range does not constitute required minimums or maximums. You may read many more papers than you discuss in detail. Do not view this as “wasted” effort—these should help inform the overall narrative and context for your discussion.
Analyze. Ask and answer fundamental research questions: What were the goals of each of the papers? What scientific and engineering questions did each of them tackle? How well did the evaluations support the main claims? What was not done that could have been done?
Synthesize. How do the efforts relate to one another? Do they follow one after another, making (incremental) progress on a task (metric)? Does one question some basic assumptions of another, and if so, how do the other papers fit in? What are the limitations of these approaches, and what still remains to be done? You can also link these papers and ideas to related fields.
Within the Paper Track, you will be evaluated primarily on the completeness, thoroughness, and clarity of your paper. Grammatical, logical, organizational, or factual errors will negatively impact the score. Weak or lacking analysis and synthesis will also be large negative influences on the score.
Papers should be four pages, not including references, and must follow the ACL style guide (either the LaTeX or the Word template is fine). Be sure to cite appropriately and follow all academic honesty standards. You may include figures (your own, reproductions, or copies of existing figures); be sure to provide appropriate credit for the figures. However, make the figures count: do not include them simply to pad the paper. Do not consider just “recent” papers; try to find papers from the past 25 years.
Some good places to find papers include
Or if you’re interested in speech:
You can also find papers using Google Scholar if you know the right keywords.
Milestone 1: Selection of Option, Topic, and Initial list of papers. Decide on a topic and fill out the short form online. It would also be beneficial if you find at least 5 papers related to your topic. If you list them in the form, I can give you feedback regarding their relevance to your topic and their quality. This list is not your complete or final list of papers you’ll read or consult: it is meant as a starting point. You may also remove papers from this list when you actually write your paper.
Milestone 2: Initial version of the paper. Despite it being “initial,” this must be a complete, well-written paper. Although this part is ungraded, the better this draft is, the more informative the feedback you’ll get from your peers.
You will turn in:
Milestone 3: Paper Peer Review. In this process, you will receive up to two other students’ papers; you must provide feedback on the breadth, depth, and clarity of exposition. Reviewing forms and guides will be provided. To receive full credit, your reviews must be constructive and civil. This review will be “double-blind”: as a reviewer, you will not know whose papers you are reviewing, and as an author, you will not know who your reviewers are. This is why it is important for the Milestone 2 papers to be anonymized. All author and reviewer identities will be known to course staff.
Final version of the paper. This must be a complete, well-written paper.
You will turn in:
Please select a topic from the ones listed below. In consultation with the instructor, you may also propose your own topic.
For this topic, you will examine advanced and/or hierarchical approaches to language modeling.
For this topic, you will examine how non-language signals (e.g., image or audio features) can help NLP tasks, how NLP tasks/models can improve understanding/analysis of those non-language signals, or both.
For this topic, you will examine structured prediction for a single task, or a significant, relevant aspect of that task. Roughly, structured prediction is any task that given an input, produces some object or label with an internal structure, such as a syntax tree or semantic frame.
For this topic you will examine how computational/statistical models are developed to better explain (or mimic) linguistic phenomena/subfields. For example, you could explore computational approaches to phonology, morphology, syntax, semantics, or pragmatics, or any combination of these, e.g., morphosyntax, syntax-semantics, phonology/morphology, typology, etc.
For this topic you will examine methodological and/or evaluation approaches for generating natural language. Classic examples of natural language generation include machine translation and abstractive summarization. There’s an entire SIG (special interest group) on generation (SIGGEN) and a conference (INLG) devoted to it.
For this topic you will explore ethical concerns (and approaches for dealing with them) in NLP, and/or issues of implicit/explicit bias in NLP models. You might want to check out the FAccT conference proceedings if you’re interested in this topic.
For this topic you will survey how NLP can be used in an area of study of interest to you. For instance, there are special interest groups (called SIGs) for NLP for the humanities, Semitic languages, and biomedical applications, among many others. Look at the “SIGs” row at the bottom of the ACL Events table at https://aclanthology.org/.
Unsatisfied with any of the above options? Then feel free to pick your own topic. The requirement is that you must clear it with the instructor first and it must have a significant relevance to material covered in this course.
What types of sources should you cite in your paper?
Assignment adapted from Dr. Frank Ferraro.