Speech:Spring 2019 Proposal

Introduction

The goal of the Spring 2019 Capstone project is to build on the work of previous classes, with a focus on maintenance and documentation to improve the overall functionality of the speech recognition project. To that end, the class is split into five groups, with each group specializing in a certain area of the project to ensure that all areas of concern are properly addressed. Each group has researched the previous classes' progress in its area, looked into the current issues present in its field, and chosen areas of focus to further advance the project.


Following the practice of previous semesters, the class has been divided into five groups with the following members:

  • Modeling Group: Christian, George, Kevin, Vladimir
  • Data Group: Aashirya, Brandon, Monica, Peter
  • Experiments Group: Brooke, Dilpreet, Ethan, Nicholas
  • Software Group: Adam, Anthony, Travis
  • Systems Group: Donald, Naina, Scott, Wesley

The tasks that each group has chosen to undertake reflect an eagerness to build on the progress made by past groups while focusing on the overall reliability and maintainability of each group's area. This goal is intended to allow future classes to grasp the project material more easily and to further improve the efficiency and progress of the project.

To achieve this objective, each group has a specific goal related to its area. Modeling will focus on achieving a better Word Error Rate (WER) than the Spring 2017 class's rate of 41.3% by improving the implementation of Linear Discriminant Analysis (LDA) and a Recurrent Neural Network (RNN). Data plans to improve the experimentation process by adding a one-hour and a three-hundred-hour corpus, and to complete the work of the previous class. The Experiments group will create and improve the scripts needed to run experiments so the process is more seamless, document the steps they take, and review and revise the documentation of past groups. Software will review the speech tools currently in use, determine whether it would be beneficial to update any of them, and look into the current status of Torque. Lastly, the Systems group will work to improve the reliability of the system by getting the backup server operational and fixing hardware/BIOS-related issues present on a few drones.


The deadline to achieve the majority of the assigned tasks is before the team competition, since there will be little time to focus on these tasks once the competition commences and each team must prioritize its objective of outpacing the competing team. To that end, each group has created a timeline with specific objectives due by set dates to mitigate the impact of the team competition.

Modeling Group

Overview

The modeling group is responsible for implementing new modeling techniques in order to improve overall performance. This will be done by continuing and bolstering the techniques researched and/or implemented in previous semesters. Our focus this semester will be to improve the current implementation of Linear Discriminant Analysis (LDA) and a Recurrent Neural Network (RNN) within the HMM, and to continue the Spring 2018 semester's proposed plan to reconstruct the phonetic spellings in the dictionary. We anticipate that this will achieve a WER lower than the Spring 2017 semester's score of 41.3%, ideally below 40%. The previous semester (Spring 2018) attempted this but was unable to obtain a WER below 41.3%. We hope that by continuing and improving upon the previous semesters' work on LDA and phonetic spelling reconstruction, along with the RNN implementation from the Spring 2017 semester, we will be able to do so.

Objectives

Linear Discriminant Analysis (LDA)

Linear Discriminant Analysis (LDA) is a pre-processing technique that reduces the number of dimensions on which the data is represented. It does this by maximizing the distance between the means of each category while minimizing the variation within each category of data. The result is a new axis onto which the data is projected, with distinct separation between data of different categories. The previous semester (Spring 2018) was able to find and fix the problems with the Spring 2017 semester's implementation of LDA in the HMM; their fix resulted in a successful implementation of LDA. Our plan is to continue where the previous semester left off. We plan to work closely with the Data group to see how different representations of the data affect the LDA, possibly by increasing the number of categories and/or dimensions.
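
For reference, the quantity LDA maximizes is usually written as the Fisher criterion below (a textbook formulation, not notation specific to the Sphinx implementation). For a projection direction w, with per-category means mu_c, overall mean mu, and category sizes N_c:

    J(w) = \frac{w^{T} S_B\, w}{w^{T} S_W\, w}, \qquad
    S_B = \sum_{c} N_c (\mu_c - \mu)(\mu_c - \mu)^{T}, \qquad
    S_W = \sum_{c} \sum_{x \in c} (x - \mu_c)(x - \mu_c)^{T}

Maximizing the numerator pushes the category means apart while minimizing the denominator keeps each category compact, which is exactly the behavior described above; the top eigenvectors of S_W^{-1} S_B give the reduced set of dimensions the data is projected onto.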

Recurrent Neural Network (RNN)

A Recurrent Neural Network (RNN) is a neural network in which memory from previous inputs is essentially stored, meaning that the output of a specific neuron also feeds back in as input when producing the next output. RNNs have grown popular for machine learning problems where the data changes over time and/or an output depends on previous inputs. The last semester to implement this technology was Spring 2017. They implemented an RNN within the language model to generate random words and then used the results to train the Sphinx decoder. However, the RNN did not prove more effective at improving performance. Current research does show that RNNs are popular in speech recognition and similar technologies. Therefore, our plan is to research the topic further, continue where the Spring 2017 semester left off, and attempt to implement the RNN in different ways, potentially using it for purposes other than generating random words to train from.
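
The feedback loop described above is usually written as the following recurrence (standard textbook notation, not the Spring 2017 code):

    h_t = f\left(W_x x_t + W_h h_{t-1} + b\right), \qquad y_t = g\left(W_y h_t\right)

Here x_t is the input at step t (for example, the previous word in a language-model setting), h_t is the hidden state carried forward from step to step (the "memory"), y_t is the output, and f and g are nonlinear activation functions.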

Phonetic Spelling Reconstruction

Towards the end of the semester, the previous semester's (Spring 2018) Avenger team began to reconstruct the phonetic spellings of dictionary words. They originally came across this idea after finding problems with the previously used dictionary: some words or expressions were not being phonetically expressed the way an actual person would speak them. They began adding more phonemes to the dictionary and making the dictionary use them correctly. From there they began rewriting many of the phonetic spellings of dictionary words and providing multiple different ways to phonetically express a single word. This started to show positive results, but they were unable to finish the implementation. Our plan is to start from where they left off and attempt to fully implement their idea of reconstructing the phonetic spellings in the dictionary.
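
To illustrate what "multiple different ways to phonetically express a single word" looks like in practice, Sphinx-style dictionaries list each word followed by its phones, with alternate pronunciations marked by a parenthesized index. The entries below are generic illustrations, not lines from the project's actual dictionary:

    EITHER       IY DH ER
    EITHER(2)    AY DH ER
    TOMATO       T AH M EY T OW
    TOMATO(2)    T AH M AA T OW

Reconstructing the phonetic spellings amounts to editing and adding entries of this form so they better match how speakers in the corpus actually pronounce the words.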

Documentation

Documentation will be a consistent priority for our group. It will be important to document all experiments and research accurately so that future development has adequate information. Our plan is to document all experiments properly on the wiki, including detailed overviews and results summaries for each experiment. All research will be detailed in the individual's weekly log, which will include the information learned and the individual's assessment of how it affects the project.

Task Timeline

Familiarization with project, determine roles and responsibilities

  • Setup connectivity with Caesar, drones, and the wiki (1/29 - 2/5) - Group
  • Successfully run a Train, Decode and Scoring (2/5 - 2/12) - Group
  • Discuss and uncover initial plans for semester goals (2/5 - 2/12) - Group
  • Draft modeling section of proposal (2/5 - 2/12) - George
  • Finalize proposal and delegate specific tasks to group members (2/12 - 2/19) - George

Linear Discriminant Analysis Implementation

  • Research LDA (2/19 - 3/5) - George & Kevin
  • Implement previous semester's LDA methodology (2/26 - 3/5) - George & Kevin
  • Research and experiment with new techniques to manipulate LDA (3/5 - 4/2) - George
  • Research and experiment with different ways to adjust the data to enhance LDA (3/5 - 4/2) - Kevin
  • Implement new LDA methodologies (4/2 - 4/23) - George & Kevin
  • Finalize documentation of all findings (4/23 - 5/7) - George & Kevin

Recurrent Neural Network Implementation

  • Research RNNs (2/19 - 3/5) - Christian & Vladimir
  • Implement Spring 2017 semester's RNN (2/26 - 3/5) - Christian & Vladimir
  • Research and experiment with different aspects to improve upon RNN (3/5 - 4/2) - Vladimir
  • Research and experiment with different ways to utilize a RNN (3/5 - 4/2) - Christian
  • Implement any newly discovered methodologies (4/2 - 4/23) - Christian & Vladimir
  • Finalize documentation of all findings (4/23 - 5/7) - Christian & Vladimir

Phonetic Spelling Reconstruction

  • Create a 30 minute - 1 hour train scenario (2/19 - 2/26) - George
  • Research Spring 2018 semester's technique for phonetic spelling (2/26 - 3/19) - George & Christian
  • Create a new dictionary (2/26 - 3/19) - Christian
  • Uncover new phonetic expressions (3/19 - 4/2) - George
  • Manipulate phonetic spellings of words in new dictionary (4/2 - 4/30) - Christian
  • Finalize documentation of all findings (4/23 - 5/7) - George & Christian

Data Group

Overview

For Spring 2019, the Data Group is composed of four members: Aashirya Kaushik, Monica Pagliuca, Brandon Peterson, and Peter Baronas. The group is responsible for making sure that the data used for training and testing the acoustic models (AM) and eventually the language models (LM) is valid, that there are no errors in the data being used, and that all the data is organized in an easy-to-use manner. Although work performed in prior years has made progress toward this goal, work remains to ensure annotations are correct and useful and that the data sets are error-free. Effort will be focused on completing experimentation that was started by last year's Data Group to determine whether, and which type of, annotation in the transcripts is most beneficial to the word error rate (WER). Another task is to create a shorter corpus, one hour in length. This shorter corpus will be useful to students as a first experiment for the sandbox, since it will enable a faster test and not take multiple hours merely to generate. When developing any smaller test sets, it is critical that such sets are representative of the actual data sets, so that results are legitimate and the testing is productive. After the smaller test sets are developed, the Modeling group is tasked with performing the testing with those sets.

To achieve this objective, it is important to ensure that audio samples used in trains do not get reused in unseen-data tests, and that there are no errors in the system to begin with. Finally, the most important task is to develop a 300-hour audio training set with a 10-hour test set, where the test set is completely clean. Prior work has ensured that no utterances in the test set also exist in the training set; however, there has been no effort to ensure that no individual speaker is present in both the training and test sets. Having a common speaker makes the training set inherently unclean. To ensure no speaker is in both sets, the new 300-hour corpus (300hrb) will have 10 hours extracted from it based on conversations, not based on the sequence of speakers. The 10-hour test set will be divided into two 5-hour sets to serve as a development test and an evaluation test.

Objectives

Annotation Experimentation

Experimentation is necessary to test whether different types of annotations in the transcripts help or hinder the acoustic model and the word error rate. Prior experimentation was performed using 5-hour data sets, but the documentation indicates that the results were inconclusive. This experimentation will be repeated with the 5-hour sets to verify the results and ensure the testing process is consistent. Then, once conclusive results have been obtained with the small-scale data set, larger 30-hour sets will be used to confirm whether the change to the annotation improves the word error rate for more than just the test samples.

Creating Shorter Corpora

A short corpus, one hour in length, will be developed so models can be created and tested more quickly. Currently, the shortest test set is five hours long, which requires at least two hours of processing to generate some of the basic files. In addition to reducing the duration of test cycles, the process of creating a new corpus provides experience and familiarity that will be beneficial for subsequent tasks.

Creating New 300-Hour Corpus

The most critical project is to generate a new 300-hour corpus that splits the full corpus into 300 hours of training and 10 hours of testing in a manner that ensures the integrity of unseen testing. If some of the data used for testing was also used in training, it could lead to biased results and a falsely low word error rate; a speaker present in both would be like sneaking some of the answers into an exam on a note card. The data is already divided into conversations, so one approach is to split the data by conversation to ensure no overlap. Another approach would be to ensure directly that no speaker present in training appears in testing and vice versa. Significant experimentation will be necessary to determine the best approach. A method must also be developed to consistently grab semi-random audio files without creating breaks in the middle of conversations.
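
A minimal sketch of the conversation-based approach is shown below, in Perl since the project's other tooling is Perl. The input file name, its layout (one line per conversation: ID, hours, speaker IDs), the 10-hour target, and the choice to simply drop conversations whose speakers already appear in the test set are all assumptions for illustration, not the final design:

    #!/usr/bin/perl
    # Sketch: put whole conversations into the test set until ~10 hours are collected,
    # then send the rest to training, dropping any conversation that shares a speaker
    # with the test set so no speaker appears on both sides of the split.
    use strict;
    use warnings;

    my (%test_speakers, @train, @test);
    my $test_hours = 0;

    open my $fh, '<', 'conversations.txt' or die "conversations.txt: $!";  # hypothetical listing
    while (<$fh>) {
        chomp;
        my ($conv, $hours, @speakers) = split /\s+/;
        if ($test_hours < 10) {
            push @test, $conv;
            $test_speakers{$_} = 1 for @speakers;
            $test_hours += $hours;
        }
        elsif (grep { $test_speakers{$_} } @speakers) {
            next;    # speaker overlap: leave this conversation out entirely
        }
        else {
            push @train, $conv;
        }
    }
    close $fh;

    printf "train: %d conversations   test: %d conversations (%.1f h)\n",
           scalar @train, scalar @test, $test_hours;

Whether overlapping conversations should instead be reassigned, and whether the 10 test hours should be drawn semi-randomly rather than from the top of the listing, are exactly the kinds of questions the experimentation described above is meant to answer.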

Documentation

All changes made to scripts and to data, as well as any changes made to the file architecture, will be logged both in the individual logs of the group members and in the appropriate sections of the wiki. For instance, changes made to corpora, or any additional explanatory information, will be documented in the appropriate section. The guides on how and why the different types of corpora were made will also be expanded and updated.

Task Timeline

Creating Shorter Corpora

  • Generate a 1-hour corpus (1/7 - 2/11) - Brandon
  • Create 1-hour testing transcripts within existing corpora (2/11 - 2/18) - Brandon
  • Test the new corpus (2/18 - 2/26) - Brandon

Annotation Experimentation

  • Run a 5hr experiment on Spring 2018’s 0305/013. (2/6 - 2/19) – Monica
  • Run a 30hr experiment on Spring 2018’s 0305/013. (2/19 - 2/26) - Monica
  • Run a 5hr experiment on Spring 2018's 0305/011 and 0305/012 (2/6 - 2/20) - Aashirya
  • Run a 30hr experiment on Spring 2018's 0305/011 and 0305/012 (2/20 - 2/29) - Aashirya
  • Determine the best annotation (2/29 - 3/12) - Monica & Aashirya

Creating New 300-Hour Corpus

  • Create an SQL database with information on the speakers in each conversation so it can be read more easily (2/18 - 2/25) - Monica
  • Determine type of isolation (2/15 - 3/4) - Peter
  • Develop a new sampling script for the chosen method of isolation (3/4 - 4/1) - Brandon
  • Create corpora with new isolation and script (4/1 - 4/29) - Brandon & Monica

Experiments Group

Overview

As the Experiments group, we aim to create, improve, and expand the scripts that make running experiments, trains, and decodes seamless. In doing this, we will give the other groups a much more fluid system on which to build progress. Maximizing efficiency and reducing redundancy for the entire speech project workforce will propel us forward as a group and simplify our means for success. Our primary points of concern begin with making adjustments to the Add-Experiment and Copy-Experiment scripts to eliminate bugs and warnings, as well as adding features to improve usability. The Add-Experiment script remains one of the most commonly used scripts and should be continually updated as the project evolves. The Copy-Experiment script, on the other hand, is an underrated script which we feel is not being used to its full potential. Giving this script an option for cleaner output might entice other group members to use it more often. In addition to updating existing scripts, we will create two new scripts, the Make-Experiment script and the Running-Jobs script. These will allow us to integrate the Make-Train and Make-Decode scripts, and to capture all the training and decoding jobs running on our machines in order to display who is working on what and when. These two improvements could make running experiments significantly simpler and much more organized. Finally, even though we will be creating documentation as we make progress, we will also review, update, remove, and rearrange all existing documentation as needed. Some old documentation, scripts, and project resources are still listed on the wiki and might cause confusion for current or future groups. It is important that we keep the documentation as detailed and thorough as possible, but keeping it lean and free of expired information is just as important to reduce false leads and wasted time.

Objectives

Fix existing issues with Add-Experiment Script

The Add-Experiment script is one of the most utilized scripts in the entire speech project. As such, we would like to see it working to its maximum potential. To start, some debugging is necessary. In its current state, creating a root experiment forces you to create a sub-experiment, which you then have to remove from the wiki by hand. This is an unnecessary step that can, and should, be removed. Add-Experiment also currently has restrictions on what you can and cannot type as an experiment name, such as trailing white space and special characters. Small nuisances like this can still complicate formatting and diminish simplicity. Modifying the current code to handle these characters would prove beneficial in the long run for better file organization.
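
As a rough illustration of the kind of change involved, the sketch below shows one way addExp.pl could normalize an experiment name instead of rejecting it; the specific rules (strip surrounding whitespace, turn inner spaces into underscores, drop other special characters) are assumptions for illustration, not an agreed policy:

    # Sketch: clean up an experiment name before it touches the wiki or the filesystem.
    # The cleanup rules shown here are illustrative only.
    use strict;
    use warnings;

    sub clean_exp_name {
        my ($name) = @_;
        $name =~ s/^\s+|\s+$//g;          # drop leading/trailing whitespace
        $name =~ tr/ /_/;                 # spaces inside the name become underscores
        $name =~ s/[^A-Za-z0-9_-]//g;     # drop any remaining special characters
        die "Experiment name is empty after cleanup\n" unless length $name;
        return $name;
    }

    print clean_exp_name(" my test: exp#1 "), "\n";   # prints "my_test_exp1"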

Implement quality of life improvements for Add-Experiment Script

After making those adjustments, we would like to add a couple of features as well. Implementing a feature that automatically recognizes Wildcats user IDs when you run the Add-Experiment script would simplify the process so that the only information the system asks for is your Wildcats password. You are already logged into Caesar with your username, so the script should be able to grab your tag rather than asking you for it. We would also like to develop a feature that not only adds the information to the wiki but also creates the directory in /mnt/main/Exps for you. This would lessen redundancy and eliminate the possibility of the directory not being created.
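
A minimal sketch of the auto-detection, assuming (as the paragraph does) that the Unix account name on Caesar matches the Wildcats ID; a prompt is kept as a fallback for the case where it does not:

    # Sketch: take the Wildcats ID from the current login session instead of prompting.
    use strict;
    use warnings;

    my $user = getpwuid($<) || $ENV{USER} || $ENV{LOGNAME};
    unless ($user) {
        print "Could not detect your username. Enter your Wildcats ID: ";
        chomp($user = <STDIN>);
    }
    print "Posting experiment to the wiki as: $user\n";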

Address issues with Copy-Experiment Script

The Copy-Experiment script is another commonly used script which, as it stands, is a great program but could still use some improvements. Currently, when you run the program it copies absolutely everything, including all the prior models that were created. We would like to make this optional so users can decide whether they want this extra information. The user would simply append -a for all information, or -s for a skeleton version that contains only the most important information needed to run the current experiment. Copying only the bare essentials from Copy-Experiment would be a big step in advancing our experiment data organization. In addition, there are issues with the regular expressions used in copyExp.pl that cause the script to incorrectly modify configurations, which we will look to fix.
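
A sketch of the flag handling, using Perl's standard Getopt::Std module; which files would count as the "skeleton" is an assumption for illustration:

    # Sketch: let the user choose between a full copy (-a) and a skeleton copy (-s).
    use strict;
    use warnings;
    use Getopt::Std;

    my %opts;
    getopts('as', \%opts);
    die "Usage: copyExp.pl (-a | -s) <source-exp> <new-exp>\n"
        unless ($opts{a} xor $opts{s}) && @ARGV == 2;

    if ($opts{a}) {
        print "Copying everything, including prior models...\n";
    }
    else {
        print "Copying a skeleton: configs, transcripts, and summary files only...\n";
    }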

Create new Make-Experiment and Running-Jobs scripts

Two things that we believe would be both beneficial and attainable within the scope of this semester are two new scripts: one to combine scripts that are usually run together into a single execution, and one to keep track of who is running what experiments. Two scripts known to be run together are makeTrain.pl and makeDecode.pl. Merging these into one script called makeExp.pl can make execution more efficient and save time. The second script would keep a log of who is running which tests and when. This script, spyExp.pl, would capture all the training and decoding jobs running on our machines and display the results in a table for anyone to see. The results captured would include the program being run, the experiment number, the user, and the machine it was run on.
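
A minimal sketch of the makeExp.pl idea is below; the argument the two existing scripts expect (here, a single experiment ID) is an assumption, and the real script would pass through whatever options makeTrain.pl and makeDecode.pl actually take:

    #!/usr/bin/perl
    # Sketch of makeExp.pl: run the train and, only if it succeeds, the matching decode.
    use strict;
    use warnings;

    my $exp = shift @ARGV or die "Usage: makeExp.pl <experiment-id>\n";

    system('perl', 'makeTrain.pl', $exp) == 0
        or die "makeTrain.pl failed for $exp; not starting the decode\n";
    system('perl', 'makeDecode.pl', $exp) == 0
        or die "makeDecode.pl failed for $exp\n";

    print "Train and decode completed for experiment $exp\n";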

Update documentation

We will not be the last year of capstone students looking at these scripts, so it is important that we leave behind appropriate documentation for each of them so people in the future won't be scratching their heads over what we did. Further, we will make an effort to ensure all existing documentation, including the Scripts page, is up to date.

Task Timeline

Analysis

  • Review and gain familiarity with addExp.pl script (2/4 - 2/18) - Group
  • Review and gain familiarity with copyExp.pl script (2/4 - 2/18) - Group
  • Learn Perl and our production environment (2/4 - 2/18) - Group

Improve Add-Experiment Script

  • Address naming issues (3/5 - 3/17) - Nick
  • Create directories under /mnt/main (2/25 - 3/4) - Ethan
  • Address use of this script on drones without an internet connection (3/17 - 4/1) - Nick
  • Implement an auto-fill feature for Wildcats username (3/4 - 3/25) - Brooke

Improve Copy-Experiment Script

  • Root cause analysis on currently known issues (2/18 - 3/4) - Nick
  • Address incorrect changes to configs with directory structures like /mnt/main/Exp/sp18/0305/015 (2/11 - 3/5) - Nick
  • Integrate -s and -a flags into the copyExp.pl command to control the amount of data copied when run (3/4 - 3/25) - Ethan
  • Troubleshoot issues with regular expressions portion of copyExp.pl (3/4 - 3/25) - Ethan

Create New Running-Jobs Script

  • Create list of requirements for running jobs script (2/18 - 2/25) - Ethan
  • Research similar scripts that could help speed up the process (2/18 - 2/25) - Brooke
  • Generate code for Running-Jobs script with the name spyExp.pl (2/25 - 3/25) - Ethan & Brooke
  • Create detailed documentation for the spyExp.pl script (3/25 - 4/4) - Ethan & Brooke

Create New Make-Experiment Script

  • Create list of requirements for Make-Experiment script (2/18 - 2/25) - Dilpreet
  • Research similar scripts that could help speed up the process (2/18 - 2/25) - Dilpreet
  • Generate code for Make-Experiment script with the name makeExp.pl (2/25 - 3/25) - Dilpreet
  • Create detailed documentation for makeExp.pl script (3/25 - 4/4) - Dilpreet

Improve Documentation

  • Make sure existing documentation is up to date (3/25 - 4/4) - Brooke
  • Remove any documentation or scripts that are expired or no longer needed (3/25 - 4/4) - Ethan
  • Update and reorganize documentation to reflect our changes (3/25 - 4/4) - Dilpreet

Software Group

Overview

The main goal of the Software Group is to ensure that the software each group needs to perform its tasks is running as intended. We need to review each piece of software to determine what versions we are running, what the latest versions are, and why we should or shouldn't update them. The categories of tools we need to focus on are the speech tools (Sphinx trainer, Sphinx decoder, CMU LM toolkit, Sclite scoring tool) and the auxiliary tools (g++, Python math libraries, Emacs editor, SoX audio tool, Perl). Another topic we will look into is the status of the Torque software. Torque is used to parallelize training runs, which could be extremely useful to our project members. We need to check its current status in order to get it to a usable state. One last piece of the puzzle is to determine how to clean up the Sphinx installation without interrupting or interfering with anything that is currently in use. We need to figure out which installation files we can remove safely and which ones are actively being used. As we work through each of these tasks, we will ensure that everything is up and running properly for our project team.

Objectives

Cleanup of Sphinx

The cleanup of Sphinx will first involve our team becoming familiar with what Sphinx is and how it works. We will need to continue our research on Sphinx by going through previous years' logs and projects. It is known that there are multiple installation files and various other Sphinx files lying around that are not necessarily being used. We will mainly focus on the Sphinx base libraries to check for duplicates. We will come up with an organized way to go through the scripts to find out which paths are being used. Our goal is to determine what is being used and what isn't, so that by the end of the semester the duplicates and unneeded files can be removed.
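
One organized way to do this, sketched below, is to scan the project scripts for any path that mentions Sphinx and count how often each one is referenced; the scripts directory and the file extensions searched are assumptions for illustration:

    #!/usr/bin/perl
    # Sketch: list every Sphinx-looking path referenced by the project scripts,
    # with a reference count, so unreferenced installs stand out as removal candidates.
    use strict;
    use warnings;
    use File::Find;

    my %refs;
    find(sub {
        return unless -f $_ && /\.(pl|pm|sh|cfg)$/;
        open my $fh, '<', $_ or return;
        while (my $line = <$fh>) {
            $refs{$1}++ while $line =~ m{(/\S*sphinx\S*)}gi;
        }
        close $fh;
    }, '/mnt/main/scripts');    # assumed location of the project scripts

    printf "%6d  %s\n", $refs{$_}, $_
        for sort { $refs{$b} <=> $refs{$a} } keys %refs;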

Versions of installed software

Many pieces of software make up Caesar. We need to go over each one and make sure we understand what it does and how the pieces interact with each other; one piece of software may rely on another to work properly. We will start by studying each piece of software to understand it, creating a list of version numbers, and noting the benefits and downsides of updating each one. We will then present our findings to Professor Jonas before going through with any updates. Once approved by the professor, we will update the software accordingly.
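
A small sketch of gathering that list in one pass; the tool names mirror the auxiliary tools listed above, and the version flags shown are the standard ones for those programs:

    #!/usr/bin/perl
    # Sketch: record the first line of each tool's version output in one place.
    use strict;
    use warnings;

    my %tools = (
        'g++'    => 'g++ --version',
        'perl'   => 'perl -v',
        'python' => 'python --version 2>&1',   # Python 2 prints its version to stderr
        'sox'    => 'sox --version',
        'emacs'  => 'emacs --version',
    );

    for my $name (sort keys %tools) {
        my ($first) = grep { /\S/ } split /\n/, `$tools{$name}`;
        printf "%-8s %s\n", $name, defined $first ? $first : 'not found';
    }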

Status of Torque

Parallelization is the ability to run training in parallel on multiple processors. This is important because it would reduce the time required to run trains that typically take a while to complete. We will need to review the documentation from previous years to get an idea of when Torque was last touched and what its current state is. It is believed that the drone Rome is the main device used for Torque, with processes dispersed to other available drones to help speed up experiments. Once we fully understand how Torque works and what is required to get it working, we will see if we can get it to a usable state for our project members. To start, we will set up a mini queue using several of the drones with Rome as the head node. Once Torque is up and running, we can then think about improvements or possible updates.
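
For orientation, a Torque job is submitted with qsub and described by a short job script; the sketch below shows the general shape only (the job name, resource request, queue name, and the command run are placeholders, not the project's actual configuration), and qstat and pbsnodes can then be used to watch the queue and the drones:

    #!/bin/bash
    # Minimal Torque/PBS job script sketch (submit with: qsub train_job.pbs).
    # Job name, resource request, and queue name below are placeholders.
    #PBS -N train_0305_015
    #PBS -l nodes=1:ppn=4
    #PBS -q batch
    #PBS -j oe

    # Torque starts the job in the home directory; move back to where qsub was run.
    cd "$PBS_O_WORKDIR"
    perl makeTrain.pl 0305/015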

Documentation

With the three main tasks we have at hand, we will be documenting our processes every step of the way. For the versions of installed software, we will be documenting what the current versions of software on Caesar are, what the newest versions are, and any benefits and/or downsides of upgrading with extensive reasoning. For the status of Torque, we will be documenting how Torque works, our steps for testing and setting up the queues on the drones including Rome, and details on our tests to show how each step of the process went. For the cleanup of Sphinx, we will document the location of each library and how we accessed them, how we ran tests to gather the needed information, and what our findings are for each test. Once information is gathered each week, we will document it accordingly in our logs and on the wiki.

Task Timeline

Cleanup of Sphinx

  • Figure out the best way to research which Sphinx libraries are being used and begin the process (2/12 - 3/5) - Adam
  • Document the findings in an organized list. Start determining which libraries and installation files are not being used (3/5 - 4/2) - Adam
  • Put together a final list of what is being used and what is not being used. Prepare to present findings to Professor Jonas (4/2 - 4/30) - Adam

Versions of installed software

  • Start gathering info from Caesar regarding each piece of software and put it in a detailed list. (2/12 - 3/5) - Anthony
  • Find out the possibilities of upgrading each piece of software and what the outcomes would be. (3/5 - 4/2) - Anthony
  • Document final notes for each piece of software and our decisions on upgrading. Present this to Professor Jonas and document what the next steps are for over the Summer. (4/2 - 4/30) - Anthony

Status of Torque

  • Figure out the current status of Torque and what is needed to get it up and running again. (2/12 - 3/5) - Travis
  • Set up a mini queue using Rome and several other drones. Document testing process. (3/5 - 4/2) - Travis & Anthony
  • Get Torque to a usable state and have the team test. Document latest information on it so next year can continue with it. (4/2 - 4/30) - Group

Systems Group

Overview

The primary goal of the Systems group is to ensure the overall health of the Caesar server and the drones connected to it. Previous Systems groups have focused on upgrading the hardware and updating the software tools. The most recent group, in 2018, focused more on stability; we mirror their concerns in this area and believe it requires our immediate attention to ensure the health of the system. Currently, the backup server is not working properly, and multiple issues have been detected that appear to be related to the BIOS settings of several drones. Additionally, there is a need to improve the available resources in the system to ensure that data can be stored and experiments can run on the server without over-allocating system resources. To address these problems and improve overall system functionality, our efforts will focus on a combination of system stability and expanding system resources.

Objectives

Improve Backup Process

The current backup server is not backing up data as it should and requires our immediate attention. We will first need to investigate any errors we find with the backup server and address them accordingly. Once the backup server is functioning as expected, our next step is to do a full system backup to ensure all the data on the server can be recovered in case of a system failure. Once the data is backed up, we plan to schedule routine backups so that the data is backed up on a regular basis. The routine backups will only include new data and will not rewrite old data, to keep the backup process as efficient as possible. In preparation for a scheduled backup process, we plan to stage the initial backup during spring break (assuming the present issues have been addressed by that point) to negate the potential performance impact that Caesar may suffer during the backup.
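
One common way to get the "new data only" behavior is an rsync job driven by cron; the sketch below is illustrative only, and assumes the project data lives under /mnt/main and that the backup server mounts its storage at /backup/caesar (the destination path is hypothetical):

    # Crontab entry sketch: every Saturday at 2:00 AM, copy only files that have
    # changed since the last run; --delete removes files that no longer exist on Caesar.
    0 2 * * 6  rsync -a --delete /mnt/main/ /backup/caesar/

Whether deletions should be mirrored or kept as an archive, and what day and time disturb Caesar least, are decisions to settle with the professor before anything is scheduled.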

Server Maintenance

We have discovered that the current servers do not have their times set correctly, nor are they being properly synced. The first plan of action is to manually set all the servers to the correct standard time before exploring the possibility of automation. In the process of manually setting the times, we will look into the three servers that are experiencing issues. From an initial inspection of the drone servers, Obelix, Automatix, and Miraculix appear to have problems related to their BIOS settings. The problems currently known are a low-powered PERC card battery in both Miraculix and Obelix, along with an inability to detect a monitor after boot-up on Miraculix. Furthermore, all of the listed drones had trouble booting into Red Hat without human intervention. To ensure all three servers boot properly without human interference, we plan to investigate the BIOS settings of these servers and make the appropriate changes, along with replacing the battery in Miraculix and Obelix. After manually setting the times of the servers that are not set correctly and addressing the problems in the drones, we plan to sync the servers together (all drones and Caesar) and link them to a trusted time server using the Network Time Protocol (NTP). In anticipation of setting up NTP for the servers, we will discuss the potential benefits and drawbacks with the professor before proceeding.
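
For reference, on Red Hat the usual setup is to point each machine's ntpd at the same time source in /etc/ntp.conf; the server name below is a placeholder, and the final choice of time server is part of the discussion with the professor:

    # /etc/ntp.conf sketch: use one trusted upstream source on Caesar and the drones.
    server 0.pool.ntp.org iburst
    # then restart and enable the service:
    #   service ntpd restart
    #   chkconfig ntpd on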

Normalize Operating Systems

To ensure that all the drones and Caesar itself are identical at the operating system level, we plan to check the OS version each drone is running. Once we know what version each drone is running, we will research how the OS version may affect the programs running on the drones and report on the positives and negatives of continuing to operate on the current versions. To further verify that all of the drones are functioning the same, we will run a five-hour train and decode on each drone to verify that each drone is functioning properly and producing results in line with the other drones.

Improve System Resources

The current resources available on Caesar could be improved to further enhance the processing capability of the system. Given that there is a spare server available to be added, we believe we should install this new drone to improve the processing resources of the system. To make the necessary changes, we will investigate procuring memory for the new server, since it presently has none and will need memory to function. We propose that the class procure six sticks of DDR3 RAM (six are needed for triple-channel memory to function properly) of the same capacity as in the drones presently running on Caesar, so that the new drone server can be installed into the system. From our current research, it will cost about $50 to buy the necessary memory to get the server operational. If this objective is achieved, we would also like to improve the power stability of the server itself by obtaining a UPS if possible, though cost may prohibit such a procurement.

Server Configuration

We find it necessary to investigate the current configuration of the server and determine whether there is a parent-child rights issue present. This will be done by first researching possible parent-child rights issues on Linux servers. Then, based on the current configuration of Caesar, we will determine whether such an issue is present. If it is, we will look into possible ways of resolving the problem. Before making any changes, we will first discuss the issue and possible resolution with the professor, covering the benefits and drawbacks of making such a change.

Documentation

In the process of getting the backup running, we will add documentation about the steps we take to get it fully operational, and we will revise the current documentation for the setup process as well as how to properly diagnose certain issues. Furthermore, we will document the OS versions of each of the drones and note any potential issues related to the OS version, along with possible resolutions for any issues discovered. If necessary, we will document the process of fixing any parent-child relationship conflict that may be occurring within the servers. We will also add documentation on any improvements made to the current hardware, such as the addition of the new drone. Lastly, we will improve the documentation on how to properly diagnose problems detected on the server (system outages, error messages, etc.).

Task Timeline

Improve Backup Process

  • Get the backup server to correctly back up the Caesar server (2/12 - 3/5) - Don
  • Schedule a complete backup of Caesar (3/11 - 3/15) - Don
  • Schedule routine backups (3/15 - 4/2) - Don

Server Maintenance

  • Manually set time in affected servers (2/12 - 2/19) - Group
  • Fix BIOS issues in Miraculix, Automatix and Obelix (2/12 - 02/26) - Naina
  • Procure and install new PERC card battery in Miraculix and Obelix (2/12 - 3/12) - Scott
  • Set up NTP to sync time across servers (3/05 - 3/12) - Scott
  • Investigate IP addresses for server DRAC cards (3/12 - 3/26) - Group

Normalize Operating Systems

  • Check OS versioning of all servers (2/12 - 2/26) - Don
  • Investigate any issues related to OS (2/26 - 3/19) - Wesley
  • Run a train and decode on all drones (2/26 - 4/2) - Group

Improve System Resources

  • Propose purchase and procurement of memory for new drone (02/12/2019 - 03/12/2019) - Don
  • Installation of memory in drone (3/12 - 5/7) - Naina
  • Set up the OS on the new drone (3/12 - 4/2) - Naina

Server Configuration

  • Research potential child parent rights issue (2/19 - 2/26) - Wesley
  • Report back on findings (2/26 - 3/5) - Wesley
  • Present possible solutions and implement if possible (3/5 - 3/19) - Wesley

Documentation

  • Add and Improve documentation of various processes (2/12 - 4/23) - Group
  • Review documentation and ensure everything is accurately described (4/23 - 5/7) - Wesley