Setting Up New Project Directories

This note demonstrates a few practical steps for setting up a workspace for a new data science project.

Specifically, it includes the bash commands necessary to:

  • Create a local project directory,
  • Set up a new environment with Conda,
  • Save that Conda environment to an environment file within the project directory,
  • Initialize the directory as a Git repository and link to GitHub, and
  • Set up a simple bash command to quickly and cleanly enter the new project workspace.

1. Create a local project directory

I use a folder called repos, located at ~, to hold projects I want to push to GitHub. The project I am setting up here is called finding-donors.

cd ~/repos
mkdir finding-donors
cd finding-donors
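
If you expect to rerun these steps, a slightly more defensive variant may be useful. The sketch below assumes two variable names, REPOS_DIR and PROJECT, which are not part of the original commands and exist only for illustration.

```shell
# Assumed variable names; adjust to taste.
REPOS_DIR="${REPOS_DIR:-$HOME/repos}"   # parent folder for all projects
PROJECT="finding-donors"                # name of the new project

# -p creates any missing parent directories and does not fail
# if the project directory already exists.
mkdir -p "$REPOS_DIR/$PROJECT"
cd "$REPOS_DIR/$PROJECT"
```

Quoting the paths keeps the commands working even if a directory name ever contains spaces.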

2. Create a new environment with Conda

conda create --name finding-donors python=3.7

3. Activate that new environment and (optionally) install packages

conda activate finding-donors
conda install jupyter pandas scikit-learn

4. Export the environment to environment.yml

conda env export --no-builds > environment.yml
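
The exported file typically ends with a prefix: line that records where the environment lives on your machine, which is of no use to collaborators. One possible way to drop it is sketched below. The helper name strip_prefix is hypothetical, not a conda feature.

```shell
# Hypothetical helper: remove the machine-specific "prefix:" line
# from an exported environment read on standard input.
strip_prefix() {
  grep -v '^prefix:'
}

# Usage (with the environment active):
#   conda env export --no-builds | strip_prefix > environment.yml
```

The rest of the file (name, channels, dependencies) passes through unchanged.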

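5. Initialize the directory as a Git repository and link to GitHub

After setting up a new repo on GitHub, run the following from within ~/repos/finding-donors. Replace both the username (RyanTWingate) and repo name (finding-donors) with your own. If your local branch is named main rather than master, push main instead.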

git init
git add .
git commit -m "first commit"
git remote add origin https://github.com/RyanTWingate/finding-donors.git
git push -u origin master
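
Since the remote URL is the only part that changes between projects, it can help to build it from its two pieces. This is a minimal sketch; the function name github_url is an assumption made for illustration and is not a Git command.

```shell
# Hypothetical helper: build the HTTPS remote URL from a GitHub
# username ($1) and a repository name ($2).
github_url() {
  printf 'https://github.com/%s/%s.git\n' "$1" "$2"
}

# Usage (inside the project directory, after `git init`):
#   git remote add origin "$(github_url RyanTWingate finding-donors)"
#   git remote -v    # lists the configured remotes to confirm the URL
```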

6. Set up a simple Bash alias to quickly enter the new workspace

Add the following to your .bashrc file, usually located at ~, then run source ~/.bashrc or open a new terminal for it to take effect. Replace the alias name, directory name, and environment name as appropriate.

alias fd='cd ~/repos/finding-donors && conda activate finding-donors'
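
If you end up with several projects, one alias per project gets repetitive. A shell function that takes the project name as an argument is one alternative. The sketch below assumes the directory and the Conda environment share a name, as they do in this note; the function name work and the variable REPOS_DIR are hypothetical.

```shell
# Hypothetical function for ~/.bashrc.
# Assumes each project directory and its Conda environment share a name.
REPOS_DIR="${REPOS_DIR:-$HOME/repos}"

work() {
  # Activate the environment only if the directory change succeeded.
  cd "$REPOS_DIR/$1" && conda activate "$1"
}
```

With this in place, work finding-donors does the same job as the fd alias above.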

Other Useful Setup and Commands

To list the available conda environments:

conda env list
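
To check for one particular environment instead of reading the whole list, the output can be filtered by its first column, which holds the environment name. This is a rough sketch; the helper name has_env is an assumption, not part of conda.

```shell
# Hypothetical helper: succeed if the environment named $1 appears in
# the first column of `conda env list` output read on standard input.
has_env() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Usage:
#   conda env list | has_env finding-donors && echo "environment exists"
```

Matching the whole first column avoids false positives from environments whose names merely contain the one you are looking for.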

To deactivate the environment later, when necessary:

conda deactivate

To remove the environment later, if necessary:

conda remove --name finding-donors --all

For a quick way to exit the workspace, consider adding the following to your .bashrc file, usually located at ~:

alias q='cd && conda deactivate'

To create a conda environment from an environment.yml file in the current directory:

conda env create --file environment.yml