Need help to fix ERR 500

MikeiLL · October 19, 2015, 3:26pm

This looks like it will be very useful information. Thank you.

MikeiLL · November 7, 2015, 9:02am

@maikel and @gnollet I have hit the same issue you solved a few weeks ago, trying to get monit to run the /script/delayed_job process. For some reason, possibly unwisely, I changed the app name to ofn_america throughout the deploy script, but I don’t imagine that should make much difference.

The script generated by ansible via the /roles/common/templates/monit.j2 template looks like this:

check process ofn_america_dj_worker_0
  with pidfile /home/ubuntu/apps/ofn_america/current/tmp/pids/delayed_job.0.pid
  start program = "/bin/bash -c 'RAILS_ENV=staging /home/ubuntu/.rbenv/shims/ruby /home/ubuntu/apps/ofn_america/current/script/delayed_job -i 0 start'"
    as uid ubuntu and gid ubuntu
    with timeout 120 seconds
  stop program = "/bin/bash -c 'RAILS_ENV=staging /home/ubuntu/.rbenv/shims/ruby /home/ubuntu/apps/ofn_america/current/script/delayed_job -i 0 stop'"
    as uid ubuntu and gid ubuntu
    with timeout 120 seconds
  if mem is greater than 250.0 MB for 3 cycles then restart

And the process will start (sending emails) if I run the CLI command directly:

/bin/bash -c 'RAILS_ENV=staging /home/ubuntu/.rbenv/shims/ruby /home/ubuntu/apps/ofn_america/current/script/delayed_job -i 0 start'

And /home/ubuntu/apps/myapp/current/tmp/pids/delayed_job.0.pid exists (until I stop the process).

-rwxr-xr-x 1 ubuntu ubuntu  175 Nov  6 01:35 delayed_job

I notice you reference a file called delayed_job.sh above which in my server contains:

#!/usr/bin/env bash

export HOME="/home/ubuntu"
export PATH="$HOME/.rbenv/bin:$HOME/.rbenv/shims:$PATH"

$HOME/apps/ofn_america/current/script/delayed_job $@

I can also, from within the script dir start the process with CLI:

sudo bash delayed_job.sh -i 0 start

And again, tmp/pids/delayed_job.0.pid exists and emails are sent.
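
Since monit tracks the worker only through its pidfile, it can help to confirm that the PID recorded there belongs to a live process. A minimal sketch, assuming bash; pid_alive is a hypothetical helper name, not part of monit or delayed_job:

```shell
#!/usr/bin/env bash
# pid_alive PIDFILE: succeed if PIDFILE is readable and names a running process.
# Hypothetical helper, for illustration only.
pid_alive() {
  local pidfile="$1" pid
  [ -r "$pidfile" ] || return 1
  pid="$(tr -d '[:space:]' < "$pidfile")"
  [ -n "$pid" ] || return 1
  # kill -0 sends no signal; it only checks that the process exists
  # and that the current user is allowed to signal it.
  kill -0 "$pid" 2>/dev/null
}

# Example (run as the ubuntu user or as root):
#   pid_alive /home/ubuntu/apps/ofn_america/current/tmp/pids/delayed_job.0.pid && echo running
```

Run it as the user that owns the worker (or as root); for another user's process kill -0 fails even when the process is alive.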

I restarted monit with sudo service monit restart, which printed: * Restarting daemon monitor monit.

I waited a few minutes and even tried raising the timeout to 240, but it’s not nearly that slow when starting delayed jobs from the CLI.

Any thoughts?

Posted here as well.

maikel · November 8, 2015, 12:03am

Hi!

Monit is a bit difficult to debug. Look into /var/log/monit.log to see if monit is trying to start delayed job. It will only show bash -c as command to start. But probably it will tell you that starting failed every two minutes. Unfortunately, it won’t give you any output of the failing command.
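
To narrow the log down to one service, something like the following may help. It is only a sketch: it assumes monit writes the service name into each relevant log line, the exact wording of monit's messages varies by version, and monit_events is a name made up for this example.

```shell
#!/usr/bin/env bash
# monit_events LOGFILE SERVICE [COUNT]
# Print the last COUNT (default 20) lines of LOGFILE that mention SERVICE.
# Hypothetical helper, for illustration only.
monit_events() {
  local logfile="$1" service="$2" count="${3:-20}"
  # -F treats the service name as a fixed string, not a regular expression.
  grep -F -- "$service" "$logfile" | tail -n "$count"
}

# Example (reading the log may require root):
#   monit_events /var/log/monit.log ofn_america_dj_worker_0
```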

I found it very useful to follow this post about setting the environment as Monit does.

sudo su - ubuntu
env -i PATH=/bin:/usr/bin:/sbin:/usr/sbin /bin/sh
/bin/bash -c 'RAILS_ENV=staging /home/ubuntu/.rbenv/shims/ruby /home/ubuntu/apps/ofn_america/current/script/delayed_job -i 0 start'

That should tell you what goes wrong.
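
The same idea can be wrapped in a small function so a command can be retried quickly under a stripped-down environment. This is a sketch, assuming bash and the PATH value quoted above; run_like_monit is a hypothetical name:

```shell
#!/usr/bin/env bash
# run_like_monit COMMAND_STRING
# Run COMMAND_STRING through /bin/sh with an emptied environment and only the
# minimal PATH quoted above, so the output of a failing start command is visible.
# Hypothetical helper, for illustration only.
run_like_monit() {
  env -i PATH=/bin:/usr/bin:/sbin:/usr/sbin /bin/sh -c "$1"
}

# Example (as the ubuntu user):
#   run_like_monit "/bin/bash -c 'RAILS_ENV=staging /home/ubuntu/.rbenv/shims/ruby /home/ubuntu/apps/ofn_america/current/script/delayed_job -i 0 start'"
```

Because env -i drops HOME and every rbenv-related variable, a command that works in a login shell but fails here is most likely depending on one of them.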

We used delayed_job.sh in the past. It was called by Monit. But the only thing it does, is setting the PATH environment variable. So we figured that we can simplify the call stack. Maybe we missed something. So I’m looking forward to your findings.

MikeiLL · November 9, 2015, 7:16am

I don’t know, my friend. It’s working now. With the bash script OR going straight to the ruby script. I wish I knew what was different. I had also found my way to that same (great) SO post recommending to use Bash with a $PATH variable set.

Thank you for the input, though.

Do you know offhand how Unicorn is supposed to be run? Is monit configured to manage that as well? I had to manually restart it via sudo service unicorn_ofn_america restart.

Also I’d love to get an idea of your server resources. I’m right now with:

2 Cores
2048MB RAM
40GB Disk
2000GB Bandwidth

But have already been pushing the limits with very little usage of the app.

maikel · November 11, 2015, 12:49am

Good that it’s working now.

Unicorn is normally just running. We use the Git post-receive hook to deploy new versions. That script is stopping and starting unicorn (sudo service unicorn_openfoodnetwork stop). A restart works most times but doesn’t pick up newly installed gems.

We upgraded our server a couple of times because of memory issues. Currently, we have 4 cores with 8GB memory. We are still running only two workers. But we hope that we can reduce the memory consumption and run 4 workers on that server. I have no idea about the bandwidth. It’s an AWS server.

MikeiLL · November 11, 2015, 2:43am

Very cool. I hadn’t heard of post-receive before. It looks relatively straightforward, but if yours is in a place where I could reference it, I wouldn’t mind a look. Will also experiment with using sudo service unicorn_ofn_america instead of just sudo service unicorn and see how that works.

Your input is much appreciated as the US development team is a little lonely at the moment.

maikel · November 12, 2015, 8:23am

I documented the deployment via the post-receive hook recently: https://github.com/openfoodfoundation/ofn_deployment/wiki/Deployment-with-Git

But it assumes that you provisioned with the latest ofn_deployment code. We updated the post-receive template in there recently. You could put it on the server manually, but you need to replace all the variables in the template then: https://github.com/openfoodfoundation/ofn_deployment/blob/master/roles/app/templates/post-receive.j2

MikeiLL · November 12, 2015, 3:02pm

Very cool, man. I just pulled in the latest changes to the deployment script yesterday, as well as updating ruby: 2.1.5p273 # was 1.9.3-p392. I think that still needs to be updated in the ofn_deployment example script, although I’m not sure if we want to specify p273 or not as I had already installed it manually with rbenv install 2.1.5 (unspecified).

maikel · November 12, 2015, 10:46pm

Rohan wrote a playbook for updating Ruby on the server as well. Our Gemfile just specifies 2.1.5. There shouldn’t be a reason to specify the patch level.

gnollet · November 17, 2015, 9:54pm

Hi,

I’m trying to upgrade to the latest version using ofn_deployment, and I ran into some issues.
The seed files are not set up to use the l10n package.
In main.yml under roles/deploy/tasks, I changed the config as below:

# TODO: Ugly hack until we have better configuration management
- name: symlink into the repo
  file: src={{ item.src }} dest={{ item.dest }} state=link force=yes owner={{ unicorn_user }}
  with_items:
    - { src: "{{ assets_path }}", dest: "{{ build_path }}/public/assets" }
    - { src: "{{ system_path }}", dest: "{{ build_path }}/public/system" }
    - { src: "{{ spree_path }}", dest: "{{ build_path }}/public/spree" }
    - { src: "{{ config_path }}/database.yml", dest: "{{ build_path }}/config/database.yml" }
    - { src: "{{ config_path }}/application.yml", dest: "{{ build_path }}/config/application.yml" }
    # - { src: "{{ config_path }}/seeds.rb", dest: "{{ build_path }}/db/seeds.rb" } # I commented out this line
    - { src: "{{ l10n_path }}/seeds.rb", dest: "{{ build_path }}/db/seeds.rb" }
    - { src: "{{ l10n_path }}/suburb_seeds.rb", dest: "{{ build_path }}/db/suburb_seeds.rb" }
    - { src: "{{ l10n_path }}/suburbs.csv", dest: "{{ build_path }}/db/suburbs.csv" }
    - { src: "{{ l10n_path }}/states.yml", dest: "{{ build_path }}/db/default/spree/states.yml" }
    - { src: "{{ l10n_path }}/countries.yml", dest: "{{ build_path }}/db/default/spree/countries.yml" }
  tags: symlink

After fixing the seeds, I got this error at the end of the deployment:
NOTIFIED: [mortik.nginx-rails | restart nginx] ********************************
changed: [127.0.0.1]

NOTIFIED: [webserver | restart unicorn] ***************************************
changed: [127.0.0.1]

NOTIFIED: [webserver | restart unicorn step 2] ********************************
failed: [127.0.0.1] => {"failed": true}
msg: unicorn_openfoodnetwork: unrecognized service
unicorn_openfoodnetwork: unrecognized service

FATAL: all hosts have already failed – aborting

NOTIFIED: [webserver | restart unicorn step 2] ********************************
FATAL: no hosts matched or all hosts have already failed – aborting

FATAL: all hosts have already failed – aborting

NOTIFIED: [webserver | restart unicorn step 2] ********************************
FATAL: no hosts matched or all hosts have already failed – aborting

FATAL: all hosts have already failed – aborting

PLAY RECAP ********************************************************************

The failing tasks are in the handler roles/webserver/handlers/main.yml but I don’t know how to fix it.

In production.log (but maybe the deploy process has not completed and this is not important for the moment?):
Completed 500 Internal Server Error in 57.0ms
** [Bugsnag] No API key configured, couldn’t notify

ActionView::Template::Error (darkswarm/all.css isn’t precompiled):
12:
13: = yield :scripts
14: %script{src: “//maps.googleapis.com/maps/api/js?libraries=places,geometry&sensor=false”}
15: = split_stylesheet_link_tag "darkswarm/all"
16: = javascript_include_tag "darkswarm/all"
17:
18:
app/views/layouts/darkswarm.html.haml:15:in `_94a4bac7f0ff8866b37431d489d93af7’

If you have an idea, it is welcome.
Thanks

MikeiLL · November 18, 2015, 2:19am

What happens if you go to the terminal and run:

sudo /etc/init.d/unicorn_openfoodnetwork restart

?

gnollet · November 18, 2015, 6:58am

It asks me to enter the password for the user.
In production.log, I get the same error I posted in my previous message.

MikeiLL · November 18, 2015, 2:01pm

Apologies if I’m misunderstanding the issue or if I’m misremembering how it works, but does the unicorn_openfoodnetwork service exist in /etc/init.d/ directory?
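
A quick way to answer that question from the shell, as a sketch: init_script_status is a hypothetical helper, and it only looks at the SysV init directory (it does not check Upstart or systemd units).

```shell
#!/usr/bin/env bash
# init_script_status NAME [DIR]
# Report whether DIR/NAME (DIR defaults to /etc/init.d) is missing,
# present but not executable, or executable.
# Hypothetical helper, for illustration only.
init_script_status() {
  local name="$1" dir="${2:-/etc/init.d}"
  if [ -x "$dir/$name" ]; then
    echo "executable"
  elif [ -e "$dir/$name" ]; then
    echo "not executable"
  else
    echo "missing"
  fi
}

# Example:
#   init_script_status unicorn_openfoodnetwork
```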

gnollet · November 18, 2015, 10:19pm

Yes, the service exists.

I think I found the issue: in vars.yml, there are 2 variables for the user:

user: openfoodnetwork

# User name for the unprivileged user which runs unicorn
unicorn_user: openfoodnetwork

user is used by playbook user.yml
unicorn_user is used by deploy.yml

I defined the same user for both variables and now that step of the playbook passes!
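
A mismatch between the two variables can be caught before running the playbook. The sketch below assumes vars.yml keeps both as plain top-level "key: value" lines without quotes or trailing comments; it does no real YAML parsing, and yaml_value and users_match are hypothetical names.

```shell
#!/usr/bin/env bash
# yaml_value KEY FILE
# Print the value of the first top-level "KEY: value" line in FILE.
# Not a YAML parser: quotes, nesting and trailing comments are not handled.
yaml_value() {
  awk -F': *' -v k="$1" '$1 == k { print $2; exit }' "$2"
}

# users_match FILE
# Succeed only if both user and unicorn_user are set and identical.
users_match() {
  local u uu
  u="$(yaml_value user "$1")"
  uu="$(yaml_value unicorn_user "$1")"
  [ -n "$u" ] && [ "$u" = "$uu" ]
}

# Example:
#   users_match vars.yml || echo "user and unicorn_user differ"
```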

Now, the playbook is failing on seeds :
TASK: [deploy | seed database] ************************************************
failed: [127.0.0.1] => {“changed”: true, “cmd”: [“bash”, “-lc”, “/home/openfoodnetwork/apps/openfoodnetwork/shared/config/seed.sh RAILS_ENV=production”], “delta”: “0:00:27.616035”, “end”: “2015-11-18 21:45:05.331926”, “rc”: 1, “start”: “2015-11-18 21:44:37.715891”, “warnings”: []}
stderr: Digest::Digest is deprecated; use Digest
rake aborted!
Called id for nil, which would mistakenly be 8 – if you really wanted the id of nil, use object_id
/home/openfoodnetwork/apps/openfoodnetwork/shared/l10n/suburb_seeds.rb:3:in `seed_suburbs'
/home/openfoodnetwork/apps/openfoodnetwork/current/db/seeds.rb:127:in `<top (required)>'
/home/openfoodnetwork/.gem/ruby/2.1.0/gems/activesupport-3.2.21/lib/active_support/dependencies.rb:245:in `load'
/home/openfoodnetwork/.gem/ruby/2.1.0/gems/activesupport-3.2.21/lib/active_support/dependencies.rb:245:in `block in load'
/home/openfoodnetwork/.gem/ruby/2.1.0/gems/activesupport-3.2.21/lib/active_support/dependencies.rb:236:in `load_dependency'
/home/openfoodnetwork/.gem/ruby/2.1.0/gems/activesupport-3.2.21/lib/active_support/dependencies.rb:245:in `load'
/home/openfoodnetwork/.gem/ruby/2.1.0/gems/railties-3.2.21/lib/rails/engine.rb:525:in `load_seed'
/home/openfoodnetwork/.gem/ruby/2.1.0/gems/activerecord-3.2.21/lib/active_record/railties/databases.rake:347:in `block (2 levels) in <top (required)>'
Tasks: TOP => db:seed
(See full trace by running task with --trace)

FATAL: all hosts have already failed – aborting

I uncommented a line in this part:

- name: seed database
  # We run a shell script that passes the default email and password to rake with an EOF block, so we don't hang on the prompts.
  command: bash -lc "{{ config_path }}/seed.sh RAILS_ENV={{ rails_env }}" chdir="{{ current_path }}"
  when: table_exists.stderr.find('does not exist') != -1   # <-- the line I uncommented
  tags: seed
  notify:
    - precompile assets
    - restart unicorn

With the “when” condition active, it’s ok now. The website is running.
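
For anyone reading that condition for the first time: find returns -1 when the substring is absent, so != -1 means "the stderr text of the earlier check contains does not exist", and the seed task runs only in that case. The same test written in bash, as a sketch; needs_seed is a hypothetical name, and the task that registers table_exists is not shown in this thread.

```shell
#!/usr/bin/env bash
# needs_seed STDERR_TEXT
# Succeed when STDERR_TEXT contains "does not exist", mirroring
#   table_exists.stderr.find('does not exist') != -1
# Hypothetical helper, for illustration only.
needs_seed() {
  case "$1" in
    *"does not exist"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
#   needs_seed "$captured_stderr" && echo "table missing, seeding"
```

Without the guard the seed script runs on every deploy, including against a database that is already populated.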

MikeiLL · November 19, 2015, 4:04am

Great news. Thank you for sharing. Will revisit this. I remember having some issues with those tasks at one point on the AWS Micro instance and might have ended up replacing with some ansible “raw” commands. Am using a slightly more robust (nonAWS) VPS at the moment.

gnollet · November 19, 2015, 9:43pm

I’m running the website on Azure VM.

delayed_job.sh is not set up correctly.
I modified the paths, but I still can’t send email.

If I try to launch it manually, I get this error message :
./delayed_job.sh
Digest::Digest is deprecated; use Digest
(eval):1: warning: encountered \r in middle of line, treated as a mere space
ERROR: no command given

Usage: delayed_job –

where is one of:
start start an instance of the application
stop stop all instances of the application
restart stop all instances and restart them afterwards
reload send a SIGHUP to all instances of the application
run start the application and stay on top
zap set the application to a stopped state
status show status (PID) of application instances
and where may contain several of the following:

-t, --ontop Stay on top (does not daemonize)
-f, --force Force operation
-n, --no_wait Do not wait for processes to stop

Common options:
-h, --help Show this message
--version Show version

I don’t know what else to check.
Thanks

MikeiLL · November 20, 2015, 5:02am

Are you running it with flags:

./delayed_job.sh -i 0 start

or

sudo bash delayed_job.sh -i 0 start

I imagine you get the same (Ruby?) error when you run:

sudo /bin/bash -c 'RAILS_ENV=staging /home/ubuntu/.rbenv/shims/ruby /home/ubuntu/apps/openfoodnetwork/current/script/delayed_job -i 0 start'

possibly with RAILS_ENV=production?

Also you have probably seen already that the canada-updates branch of the ofn_deployment repo has some significant improvements.

Also - likely that the warning about the use of Digest::Digest is in one of the gazillion gems that are used in the app.

gnollet · November 20, 2015, 7:18am

Running the command gives me this result:
./delayed_job.sh -i 0 start
Digest::Digest is deprecated; use Digest
(eval):1: warning: encountered \r in middle of line, treated as a mere space
delayed_job.0: process with pid 41919 started.

on delayed_job.log, I can see this message :
2015-11-20T07:13:13+0000: [Worker(delayed_job.0 host:vmaztstfrofn3 pid:41577)] Job Enterprise#send_confirmation_instructions_without_delay (id=4) RUNNING
2015-11-20T07:13:14+0000: [Worker(delayed_job.0 host:vmaztstfrofn3 pid:41577)] Job Enterprise#send_confirmation_instructions_without_delay (id=4) COMPLETED after 0.2093
2015-11-20T07:13:14+0000: [Worker(delayed_job.0 host:vmaztstfrofn3 pid:41577)] 1 jobs processed at 4.0954 j/s, 0 failed
2015-11-20T07:14:53+0000: [Worker(delayed_job.0 host:vmaztstfrofn3 pid:42165)] Job Enterprise#send_confirmation_instructions_without_delay (id=5) RUNNING
2015-11-20T07:14:55+0000: [Worker(delayed_job.0 host:vmaztstfrofn3 pid:42165)] Job Enterprise#send_confirmation_instructions_without_delay (id=5) COMPLETED after 2.4779
2015-11-20T07:14:55+0000: [Worker(delayed_job.0 host:vmaztstfrofn3 pid:42165)] 1 jobs processed at 0.3847 j/s, 0 failed

So I think delayed_job is running, but I don’t receive the confirmation email.

In development.log, I see this entry:
Delayed::Backend::ActiveRecord::Job Load (2.6ms) UPDATE “delayed_jobs” SET locked_at = ‘2015-11-20 07:16:56.001319’, locked_by = ‘delayed_job.0 host:vmaztstfrofn3 pid:42165’ WHERE id IN (SELECT id FROM “delayed_jobs” WHERE ((run_at <= ‘2015-11-20 07:16:56.000093’ AND (locked_at IS NULL OR locked_at < ‘2015-11-20 07:01:56.000144’) OR locked_by = ‘delayed_job.0 host:vmaztstfrofn3 pid:42165’) AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING *

gnollet · November 20, 2015, 10:11am

I will try the canada-updates branch tonight.

Thanks

MikeiLL · November 20, 2015, 1:59pm

I wish my input was more informed.

I have a question that you may be able to offer some insight on. Some of the images in my installation are not showing up. For instance, on the main admin page there is a broken link in the top left pointing to an image called “missing.png”. I poked around the code a little bit (I’m still somewhat vague on the Rails workflow) and it looks like this image gets pointed to when something’s missing. But so far I haven’t figured out where or how the real image is supposed to be set. Is there a command line task or a Spree configuration that I’m missing here?

Thanks and good luck!
