Add /mon/nsfrb/timing and /mon/nsfrb/packets to influx/grafana #320

mbsherma · 2025-05-27T14:30:48Z

I'd like to add two keys to Influx/Grafana for the nsfrb search pipeline. If possible, it'd be most convenient to have these on a new grafana dashboard, but if that's not possible I understand. The keys will be formatted as below:

/mon/nsfrb/timing: {
    0: {
        image_time: float,
        tx_time: float,
        ISOT: str
    },
    ...
    15: {
        image_time: float,
        tx_time: float,
        ISOT: str
    },
    search_time: float
}
/mon/nsfrb/packets: {dropped: int}

For /mon/nsfrb/timing, there's a separate key for each correlator node (0 to 15) which points to a dictionary with the time required for imaging and transmitting fast visibility data and the timestamp. The other key is the time for searching, which will just be one value for the full bandwidth.

For /mon/nsfrb/packets, the only key is 'dropped', which reports the number of dropped packets as an integer between 0 and 16.

Let me know if more info is needed; the realtime system isn't deployed yet, but I'm working on benchmarking tests, so it'd be helpful to view these on grafana. Thanks!

rh-codebase · 2025-05-27T21:18:45Z

Pretty sure that data structure won't work with the current implementation. Each node will have to publish its monitor data under an etcd key. In this case, like so:

/mon/nsfrbtiming/1
/mon/nsfrbtiming/2
....
/mon/nsfrbtiming/16

'0' is special: it represents all nodes (for commands), so the index starts at 1. Similarly for:

/mon/nsfrbpackets/1
....
/mon/nsfrbpackets/16

Each of the data structures should include the name of the node. For antennas, this was 'ant_num'. For correlator nodes, you could use 'corr_num', as was done for /mon/corr.

As for dashboard creation, you are likely on your own, assuming everyone can create a dashboard.

mbsherma · 2025-05-29T14:21:48Z

@rh-codebase Sorry for the delay, thanks for the clarification; I've revised the timing keys to match the format you sent:

/mon/nsfrbtiming/1...16: {
    corr_num: 0...15,
    image_time: float,
    tx_time: float,
    ISOT: str
}

I made a separate key for the search time since that will only have one value, not a value per corr node:

/mon/nsfrbsearchtiming: {
search_time:float
}
Similarly, there's only one packets key for now. "dropped" is an integer between 0 and 16 because the process server reports how many corr nodes it failed to receive data from, so the corr nodes don't need individual keys:
/mon/nsfrbpackets: {
dropped: int
}
I'll work on creating a dashboard, thanks!
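
A minimal sketch of how the revised keys and payloads could be assembled before publishing. The helper names (`timing_key`, `timing_payload`, `packets_payload`) are hypothetical, and the mapping "key index = corr_num + 1" is inferred from the 1...16 / 0...15 ranges above rather than confirmed by the monitoring code:

```python
N_CORR_NODES = 16  # correlator nodes, corr_num 0..15 as described in this thread


def timing_key(corr_num: int) -> str:
    """Return the per-node etcd key. Key indices start at 1 because 0 means 'all nodes'."""
    if not 0 <= corr_num < N_CORR_NODES:
        raise ValueError(f"corr_num must be in 0..{N_CORR_NODES - 1}, got {corr_num}")
    # Assumption: key index is corr_num + 1 (inferred from the ranges in the thread).
    return f"/mon/nsfrbtiming/{corr_num + 1}"


def timing_payload(corr_num: int, image_time: float, tx_time: float, isot: str) -> dict:
    """Build the per-node timing dictionary, including the node name field."""
    return {
        "corr_num": int(corr_num),
        "image_time": float(image_time),
        "tx_time": float(tx_time),
        "ISOT": isot,
    }


def packets_payload(dropped: int) -> dict:
    """Build the packets dictionary; 'dropped' counts nodes with missing data (0..16)."""
    if not 0 <= dropped <= N_CORR_NODES:
        raise ValueError(f"dropped must be in 0..{N_CORR_NODES}, got {dropped}")
    return {"dropped": int(dropped)}
```

Each result could then be passed to `ETCD.put_dict(key, payload)`, e.g. `ETCD.put_dict(timing_key(0), timing_payload(0, 0.51, 1.38, "2025-05-30T21:08:38.684"))`.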

rh-codebase · 2025-05-29T23:59:12Z

The following influx tables should be created once data for the above keys is seen in etcd:

/mon/nsfrbtiming => nsfrbtiming (tag: corr_num)
/mon/nsfrbsearchtiming => nsfrbsearchtiming
/mon/nsfrbpackets => nsfrbpackets

In grafana, you want to construct queries against these tables (e.g. select tx_time from nsfrbtiming group by corr_num), or something like that. Give a shout if needed.
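
As a rough illustration of the kind of query meant here, the sketch below builds an InfluxQL string for a Grafana panel. The function name `grafana_query` is hypothetical; `$timeFilter` and `$__interval` are Grafana's placeholders for the dashboard time range and bucket size, and grouping by `corr_num` assumes it is stored as a tag, as listed above:

```python
from typing import Optional


def grafana_query(field: str, measurement: str, tag: Optional[str] = None) -> str:
    """Build an InfluxQL query for a Grafana panel.

    Takes the last value of `field` in each time bucket; if `tag` is given,
    one series is returned per tag value (e.g. one line per correlator node).
    """
    group_by = "time($__interval)"
    if tag is not None:
        group_by += f', "{tag}"'
    return (
        f'SELECT last("{field}") FROM "{measurement}" '
        f"WHERE $timeFilter GROUP BY {group_by} fill(null)"
    )


# One series per correlator node for the transmit time:
print(grafana_query("tx_time", "nsfrbtiming", tag="corr_num"))
# A single series for the search time, which has no per-node tag:
print(grafana_query("search_time", "nsfrbsearchtiming"))
```

The resulting strings are meant to be pasted into the panel's raw query editor; they only return data once the corresponding measurement exists in Influx.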

mbsherma · 2025-05-30T02:33:51Z

Thanks for your help! I'm having some trouble with the query, e.g. for the nsfrbsearchtiming query I currently have this:

SELECT last("search_time") FROM "nsfrbsearchtiming" WHERE $timeFilter GROUP BY time($__interval) fill(null)

I confirmed that the field is populated in etcd:

>>> import dsautils.dsa_store as ds
>>> ETCD = ds.DsaStore()
>>> ETCD.get_dict("/mon/nsfrbsearchtiming")
{'search_time': 1.7225334644317627}

But I also see there's no corresponding measurement in influx:

>>> from influxdb import DataFrameClient
>>> influx = DataFrameClient('influxdbservice.pro.pvt', 8086, 'root', 'root', 'dsa110')
>>> influx.get_list_measurements()
[{'name': 'antmon'}, {'name': 'bebmon'}, {'name': 'calmon'}, {'name': 'corrmon'}, {'name': 'statusmon'}, {'name': 't2mon'}, {'name': 'wxmon'}]

Is there something I need to do after populating the etcd key with ETCD.put_dict to make it show up in influx?
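
For repeated checks, a small helper along these lines could compare the measurement listing against the tables expected from this thread. The name `missing_measurements` is hypothetical; the input is assumed to have the same shape as the `get_list_measurements()` output shown above (a list of dicts with a 'name' key):

```python
# Measurement names expected from this thread once the etcd keys are ingested.
EXPECTED_MEASUREMENTS = ("nsfrbtiming", "nsfrbsearchtiming", "nsfrbpackets")


def missing_measurements(listed, expected=EXPECTED_MEASUREMENTS):
    """Return the expected measurement names that are absent from `listed`.

    `listed` is a list of dicts such as [{'name': 'antmon'}, ...].
    """
    present = {entry["name"] for entry in listed}
    return [name for name in expected if name not in present]


# The listing reported above, copied from the influx client output:
listed = [{'name': 'antmon'}, {'name': 'bebmon'}, {'name': 'calmon'},
          {'name': 'corrmon'}, {'name': 'statusmon'}, {'name': 't2mon'},
          {'name': 'wxmon'}]
print(missing_measurements(listed))
```

With the listing above, all three nsfrb measurements are reported missing, which is consistent with the empty Grafana panels.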

mbsherma · 2025-05-30T19:18:58Z

@rh-codebase Just following up on this; still no luck querying from grafana. I did some digging, and maybe I'm not using the correct configuration file for ETCD? I believe the default file is etcdConfig.yml; is that still correct?

mbsherma · 2025-05-30T21:59:35Z

@rh-codebase Here are a few other examples. @caseyjlaw mentioned I should still have a number at the end of the nsfrbsearchtiming and nsfrbpackets keys, so I tried /mon/nsfrbsearchtiming/1 but got similar results:

>>> ETCD.put_dict("/mon/nsfrbsearchtiming/1",{'search_time': 1.8956832885742188,'search_num':1})
>>> from influxdb import DataFrameClient
>>> influx = DataFrameClient('influxdbservice.pro.pvt', 8086, 'root', 'root', 'dsa110')
>>> influx.get_list_measurements()
[{'name': 'antmon'}, {'name': 'bebmon'}, {'name': 'calmon'}, {'name': 'corrmon'}, {'name': 'statusmon'}, {'name': 't2mon'}, {'name': 'wxmon'}]
>>> x=influx.query("SELECT search_time FROM nsfrbsearchtiming")
>>> x
{}

And does the format and query for the nsfrbtiming key look correct? Here's the python query:

>>> ETCD.get_dict("/mon/nsfrbtiming/1")
{'corr_num': 0, 'ISOT': '2025-05-30T21:08:38.684', 'image_time': 0.5105586051940918, 'tx_time': 1.3844079971313477}

and the influx query in grafana:

SELECT last("image_time") AS "alias" FROM "nsfrbtiming" WHERE $timeFilter GROUP BY time($__interval) fill(null)

mbsherma added the Completion: nsfrb label May 27, 2025

mbsherma changed the title ~~Add /mon/nsfrb/timing and /mon/nsfrb/packets to etcd~~ Add /mon/nsfrb/timing and /mon/nsfrb/packets to influx/grafana May 27, 2025

mbsherma assigned mbsherma and rh-codebase May 27, 2025
