Responses

A. Crime will always happen everywhere, but we can look at past trends to anticipate what the future holds, so that crime is less of a surprise and we can prepare to combat it. I plan on creating a neural network to make predictions about criminal behavior based on the history of arrest bookings. I want to use machine learning to help classify criminal behavior. I also want to explore the idea of using a regression machine learning model to predict levels of crime from factors like month, neighborhood, and socioeconomic indicators. I do think that such predictions can be unfair toward minorities because of correlations in the historical data, so designing the loss function with fairness in mind may help combat that problem. I think this is an important issue to discuss because it can probably help keep crime low, especially in areas where crime is a serious problem.

  1. The optimizer selected was RMSprop. Rprop and Adagrad are related optimizers. RMSprop is an adaptation of the Rprop algorithm for mini-batch learning, since Rprop does not work well with large datasets. RMSprop is also similar to Adagrad and can be viewed as a way to deal with Adagrad's radically diminishing learning rates: it still keeps an estimate of the squared gradients, but instead of letting that estimate accumulate over the whole of training, it keeps a moving average of it.

  2. The loss function chosen was binary_crossentropy. We used this loss function because we are training a binary classifier (cat or dog). If the probability the model assigns to the true class is 1.0, the loss should be zero; conversely, if that probability is low, the loss should be large. Taking the negative log of the predicted probability of the true class achieves this and penalizes confident wrong predictions heavily.

  3. The metrics argument sets up how we judge how well the model is working. Metrics are similar to loss functions, but the results of evaluating a metric are not used when training the model; they are only reported. The compile() sketch after this list shows how these three choices fit together.
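A minimal sketch of how the three choices above come together in Keras. The small convolutional architecture here is hypothetical and only stands in for whatever network was actually trained; the compile() call is where the optimizer, loss, and metric from items 1 through 3 plug in.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical binary classifier (cat vs. dog); the exact layers are an assumption.
model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(150, 150, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # single unit: probability of the "dog" class
])

model.compile(
    optimizer=keras.optimizers.RMSprop(learning_rate=1e-4),  # moving average of squared gradients
    loss="binary_crossentropy",   # -log(probability of the true class), so bad predictions cost a lot
    metrics=["accuracy"],         # reported each epoch, but not used to compute gradients
)
```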

Figure 1

Figure 2

The model is definitely overfit. The val_loss was decreasing from the 12th through the 14th epoch and then increased on the last epoch, which is a clear indication of overfitting. The accuracy on the training set was 0.97, so the model learned the training set too well.
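Figures 1 and 2 come from the training history. A sketch like the following, assuming `history` is the object returned by `model.fit(..., validation_data=...)`, reproduces the loss curves and makes the gap between training and validation loss visible.

```python
import matplotlib.pyplot as plt

# history.history holds per-epoch values recorded by model.fit()
epochs = range(1, len(history.history["loss"]) + 1)

plt.plot(epochs, history.history["loss"], label="training loss")
plt.plot(epochs, history.history["val_loss"], label="validation loss")
plt.xlabel("epoch")
plt.ylabel("binary cross-entropy loss")
plt.legend()
plt.show()
```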

Chow chow: predicted correctly

Husky: predicted correctly

Pitbull: predicted correctly

Tabby: predicted correctly

Siamese: predicted correctly

Persian: predicted correctly

The model did accurately classify all of my pictures. However, since the model is overfit, I suggest that we reduce the network's capacity by removing layers, apply regularization by adding a cost to the loss function for large weights, and use dropout layers; a sketch of these changes follows.
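A minimal sketch of the suggested fixes, assuming the same hypothetical Keras setup as above: a smaller network, an L2 penalty on the dense-layer weights, and a dropout layer. The specific layer sizes and the 0.001 regularization factor are assumptions, not tuned values.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

smaller_model = keras.Sequential([
    layers.Conv2D(16, (3, 3), activation="relu", input_shape=(150, 150, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dropout(0.5),  # randomly zero out half the activations during training
    layers.Dense(32, activation="relu",
                 kernel_regularizer=regularizers.l2(0.001)),  # add a cost for large weights to the loss
    layers.Dense(1, activation="sigmoid"),
])

smaller_model.compile(
    optimizer=keras.optimizers.RMSprop(learning_rate=1e-4),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
```

Reducing capacity and adding the weight penalty both limit how closely the network can memorize the training set, while dropout forces it to learn redundant features, so the gap between training and validation loss should shrink.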