What servers/services do we need to bootstrap ourselves

Yes, noggin is on my radar. I haven’t brought this to leadership and infra yet, but I will once we set up the IPA infra in the coming days. I’ll treat this as a reminder. :slight_smile:

2 Likes

Themes can easily be added to noggin [1]

[1] https://github.com/fedora-infra/noggin/tree/dev/noggin/themes

Grafana doesn’t support alerting at all on panels that use its variables/templating features :frowning:

This is a long-standing issue on GitHub, and it looks like it’ll never be solved.
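A common workaround is to keep the templated dashboards in Grafana but move the alerts into Prometheus itself, where a single rule with label matchers covers every instance without any templating. A hedged sketch (the metric, rule names and threshold are made up for illustration):

```yaml
# rules/node-alerts.yml -- loaded via rule_files in prometheus.yml
groups:
  - name: node-alerts
    rules:
      - alert: HighCpuUsage
        # fires once per matching instance, so no dashboard variables are involved
        expr: 100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100 > 90
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High CPU on {{ $labels.instance }}"
```

Alertmanager then handles routing and notifications, independently of whatever Grafana can or can’t do.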

Where I come from, the user world runs on Windows. Mostly SMEs and schools that run their servers on premises and/or use VPSes for web applications.
In this world AD is what’s mostly used, and I would propose at least having the option of a Samba4 AD account provider available alongside FreeIPA.
Yes, I know RH never wanted to support this, but IMO you can’t put your head in the sand and pretend it doesn’t exist.

I support the idea of GitLab; it has a self-hosted option and can be used for bug tracking, repos, CI/CD and more.

On OpenStack I can agree; there are other platforms like Proxmox PVE that are really easy to manage.

3 Likes

Hi all,

here’s another few cents from me.

OpenStack: it can do anything you can come up with, but it is seriously complicated to set up and maintain. In-place upgrades can be done and will work when using TripleO.
The expertise level required is very high, and this is probably the main reason not to use it.

oVirt: a fantastic tool, very simple to set up, and templates make scaling your VMs super simple.
However, it’s sitting next to Gluster while waiting to be put out to pasture. RH is betting everything on OpenShift and OpenStack, so they are pulling their funding from Gluster and oVirt.

For setting up a pipeline, OpenShift is your friend.
OpenShift is very expensive… however OKD, the upstream, isn’t.
It takes a bit of skill to set up, but it makes things really easy to develop on once you get it running.
Whatever DevOps tool you like, OpenShift/OKD can work with it.
It makes scaling/upgrading your build environment really simple.
Everything in OpenShift is containers: a great framework that runs on Kubernetes.

Rob

While I agree that OpenStack is a fantastic tool, I don’t think oVirt is cut out for it… oVirt uses its own HA, and they actually refused to use the established Corosync HA just because they want to use their own stuff. As they are part of Red Hat, they have some internal issues they need to work out. oVirt works out of the box using its own HA and storage, but start adding more external stuff and it breaks. The upgrade wasn’t very smooth either :slight_smile: it depends on what you need it for.

1 Like

This would probably be the most future-proof environment, seeing that RH is moving everything in this direction. AWX, Ceph… everything is or will be containerized. And isn’t Fedora’s infrastructure now also to a large degree on OpenShift?

Also, what mailing list are we going to use? Mailman?

Bug tracking that has integration with whatever ticketing system we use is going to be important. There are projects (PostgreSQL comes to mind) where there is no public bug tracker and the public GitHub repo is a mirror of the real, private repo.

Being able to automagically reference issues, commits, etc in a bidirectional manner (“so and so referenced this issue”, “this PR resolves such and such”, etc) is incredibly useful.

1 Like

Technically, Discourse can act as a listserv…

1 Like

For monitoring, Prometheus + Thanos + Alertmanager + Grafana is the way to go.
With Thanos you can have multiple Prometheus servers in different regions or for different types of metric collection, and you can use a single data source for Grafana when you have Thanos.
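For concreteness, a minimal sketch of that layout (hostnames, labels and ports below are placeholders, and flags vary a bit between Thanos versions):

```
# prometheus.yml on each server: unique external labels let Thanos
# tell the instances apart and deduplicate their data
global:
  external_labels:
    region: eu-west
    replica: prom-1

# a thanos sidecar runs next to each Prometheus
thanos sidecar --prometheus.url=http://localhost:9090 --tsdb.path=/var/lib/prometheus

# one global query layer fans out to all sidecars; Grafana uses this
# single endpoint as its Prometheus data source
thanos query --endpoint=prom-eu.example.org:10901 --endpoint=prom-us.example.org:10901
```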

NetBox for infrastructure accounting, but we need to decide what the source of truth will be. (At one company we did it like this: when you install a new host with Ansible, at the end it runs a NetBox play and adds the host with the needed details to NetBox, so in that case our Ansible was the source of truth. Decommissioning a host was also done with a playbook, which removed the host from NetBox.)
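That pattern can be sketched as a task at the end of the provisioning playbook, here using the `netbox.netbox` Ansible collection (the URL, token variable, role and site names are placeholders):

```yaml
- name: Register the new host in NetBox
  netbox.netbox.netbox_device:
    netbox_url: https://netbox.example.org
    netbox_token: "{{ netbox_token }}"
    data:
      name: "{{ inventory_hostname }}"
      device_role: server
      site: main-dc
    state: present   # the decommission playbook runs the same task with state: absent
```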

About virtualization: using some orchestrator is of course very good, but in my opinion it’s unnecessary and just consumes resources we could use differently. We used plain KVM hosts, and we didn’t need HA, as long as servers can come and go easily with Ansible (provided, of course, that Ansible is the source of truth and is kept up to date). No manual work should be done on servers except when you are writing changes for Ansible and want to test them.

1 Like

Actually, Ansible can be used as the source of truth for a lot of tools @Atoms. We use Prometheus as a Grafana backend and Icinga for good old health checking & alerting on-prem, all deployable via Ansible through GitLab’s CI/CD pipelines running on Docker. Ansible’s inventory is also used as the source of truth for automatic creation of group/host Icinga definitions, so you don’t have to do it by hand one by one. This allows us to exploit Ansible’s fact-gathering feature in conjunction with Icinga’s DSL in order to fine-tune service check definitions (e.g. apply check_omsa on all Dell-branded servers).
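That fact-driven pattern looks roughly like this in Icinga 2’s DSL, assuming the Ansible run has written the gathered vendor fact into a host variable (names and addresses here are illustrative):

```
// hosts.conf is generated from the Ansible inventory; facts end up as host vars
object Host "web-01.example.org" {
  address = "192.0.2.10"
  vars.system_vendor = "Dell Inc."
}

// services.conf: the check attaches itself to every matching host
apply Service "omsa" {
  check_command = "check_omsa"
  assign where host.vars.system_vendor == "Dell Inc."
}
```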

THIS is the thing we have to work on first! We need a working build system + ticket system + lots of people testing things. Nothing else!

The rest will come with time.

1 Like

Correct, but that’s just my opinion. I know folks are excited to help and build… I’m not sure the project already has a lot of $ to spend on all of this.

I would like to see a date for when the first build should be done, even if we call it Alpha-1; then we can iterate from there and add supporting services etc.

Also, I know there are some organizational things that need to be ironed out really early in this effort before we can build, but w/o a date it will be hard to zero in on the hard requirements.

1 Like

I love infrastructure and logistics, but I have to second this sentiment.

Getting a codified (no human steps) Koji install and service up on rockylinux.org that pulls RH SRPMs and builds seems like it should be the only priority for now.

Would be kind of awesome if it could be folding@home style – donate compute cycles to the rockylinux build system… :laughing:

“Alpha 1” needs to be priority 1 (without any branding whatsoever). When we have a working version, people will consider jumping off the sinking CentOS ship and donating $$.

Then we can start building infrastructure.

1 Like

I see a lot of either/or between Prometheus+Grafana and Zabbix. Grafana can use the Zabbix API as a data source, which allows normal Zabbix server functionality for triggers/actions/scalability etc. while still using Grafana in place of Zabbix’s slightly narrower dashboard capabilities.
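For reference, wiring that up is mostly a matter of installing the community Zabbix plugin and provisioning a data source; a hedged sketch (URL and credentials are placeholders, and field names may differ between plugin versions):

```yaml
# /etc/grafana/provisioning/datasources/zabbix.yml
# plugin installed beforehand with:
#   grafana-cli plugins install alexanderzobnin-zabbix-datasource
apiVersion: 1
datasources:
  - name: Zabbix
    type: alexanderzobnin-zabbix-datasource
    url: https://zabbix.example.org/api_jsonrpc.php
    jsonData:
      username: grafana-readonly
    secureJsonData:
      password: changeme
```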

1 Like

While Gitea works well (I use it daily), with many people involved perhaps use something with a Gantt chart, so people can track the schedule of each task visually?

I have not used this, but there seems to be one for GitLab (and GitHub) called GanttLab.

For alerting, Prometheus can do that, but I’ve been using monit for 10+ years and it works great, with an easy-to-understand DSL and minimum fuss. (Sample configurations.)

If you want a unified web interface for monit-deployed servers, M/Monit is a nice addition. (It is self-hosted software, but with a one-time payment.)
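To give a flavour of that DSL, a small monitrc fragment (the paths, port and mail address are illustrative, not taken from a real setup):

```
set daemon 60                       # poll every 60 seconds
set alert admin@example.org         # mail on state changes

check process nginx with pidfile /run/nginx.pid
  start program = "/usr/bin/systemctl start nginx"
  stop program  = "/usr/bin/systemctl stop nginx"
  if failed port 443 protocol https then restart
  if 3 restarts within 5 cycles then alert

check filesystem rootfs with path /
  if space usage > 90% then alert
```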

(I’m not affiliated with any of the software mentioned.)

For issue tracking, Mantis is pretty old and probably not the right choice when people are more accustomed to the GitHub style, which Gitea and GitLab support.

Wiki.js is feature-rich and a solid solution (you can see a sample in their official docs). You can also do tables easily, but I’ve seen some layout corruption on mobile phones on occasion.

I know this is not a popular opinion, but WordPress can be a decent wiki platform too; so far it has worked well when I used it as a wiki, without major problems, including on mobile. Maybe something to consider.

And I would totally avoid restic for backup. While the tool seems popular and the command-line interface looks slick, it chokes when the data gets big, consumes a lot of memory, and the relevant issues get closed without being fixed. Restores also take very long on large data sets. I’ve looked at plenty of open source backup CLI tools; duplicacy is another decent one, but it also consumes a lot of memory at this point, so I wouldn’t recommend it, and it’s also slightly costly with a per-server subscription.

On the other hand, borg is quite stable and performant and is the tool I’d recommend for encrypted/deduplicated backup. It only supports SSH targets, though, so there’s no easy way to target S3 (and compatibles), but there are services like rsync.net (with a borg discount) with a 20-year track record that provide just that. There’s also BorgBase, but it’s a young service.
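A typical borg workflow against such an SSH target looks roughly like this (the host, repo path and retention numbers are placeholders, a sketch rather than a tested recipe):

```
# one-time: create an encrypted repository on the remote
borg init --encryption=repokey ssh://user@backup.example.org/./backups/main

# nightly: create a deduplicated, compressed archive...
borg create --stats --compression zstd \
    ssh://user@backup.example.org/./backups/main::{hostname}-{now} /etc /srv

# ...then thin out old archives
borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 \
    ssh://user@backup.example.org/./backups/main
```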

(Again, not affiliated with any of the services mentioned.)

Also what to do with file sharing?

NextCloud is OK, but definitely not as polished as other online storage drive services, and I wouldn’t touch most of the plugins as they’re not professionally complete.

If there is a need for shared calendars over CalDAV, I actually recommend NextCloud, as it’s easy to set up and can link up with Keycloak, compared with standalone servers like Radicale. CalDAV works well on macOS, iOS and Android. (Not sure of a Windows CalDAV client, but AgenDAV is a modern-looking, decent web interface.)

For smaller stuff, Healthchecks can be self-hosted to watch that cron jobs succeed, for peace of mind, rather than having cron jobs pile up error logs without anyone noticing.
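The usual pattern is a crontab entry that pings the Healthchecks instance only when the job exits successfully (the script path, host and check UUID below are placeholders):

```
# m h dom mon dow  command
15 3 * * *  /usr/local/bin/nightly-backup.sh && curl -fsS -m 10 --retry 3 https://hc.example.org/ping/<check-uuid> > /dev/null
```

If the ping stops arriving, Healthchecks raises the alert, so a silently failing cron job no longer goes unnoticed.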

(Was going to add links but couldn’t due to Discourse limitation on new users.)

As for monitoring, I recommend the Prometheus and Grafana stack, with VictoriaMetrics as long-term storage for Prometheus (since Prometheus itself isn’t meant for that role).

VictoriaMetrics is open source, speaks PromQL directly, and is said to be performant and to use less disk space than InfluxDB or TimescaleDB (at the time their blog post was written, two years ago).
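Hooking the two together is a one-stanza change on the Prometheus side; a sketch with a placeholder hostname (single-node VictoriaMetrics listens on port 8428 by default):

```yaml
# prometheus.yml: ship all samples to VictoriaMetrics for long-term storage
remote_write:
  - url: https://victoria.example.org:8428/api/v1/write
# Grafana can then query VictoriaMetrics directly as a Prometheus-type
# data source pointed at https://victoria.example.org:8428
```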