
Add /mon/nsfrb/timing and /mon/nsfrb/packets to influx/grafana #320


Open
mbsherma opened this issue May 27, 2025 · 6 comments

@mbsherma

I'd like to add two keys to Influx/Grafana for the nsfrb search pipeline. If possible, it'd be most convenient to have these on a new grafana dashboard, but if that's not possible I understand. The keys will be formatted as below:

  • /mon/nsfrb/timing: {0: {
    image_time: float,
    tx_time: float,
    ISOT: str},
    ...
    15: {
    image_time: float,
    tx_time: float,
    ISOT: str},
    search_time: float}

  • /mon/nsfrb/packets: {dropped: int}

For /mon/nsfrb/timing, each correlator node (0 to 15) has its own key, which points to a dictionary holding the time required for imaging, the time required for transmitting fast visibility data, and the timestamp. The other key is the time for searching, which will just be one value for the full bandwidth.

For /mon/nsfrb/packets, the only key is "dropped", which reports the number of dropped packets as an integer between 0 and 16.

Let me know if more info is needed; the realtime system isn't deployed yet, but I'm working on benchmarking tests, so it'd be helpful to view these on grafana. Thanks!

@mbsherma mbsherma changed the title Add /mon/nsfrb/timing and /mon/nsfrb/packets to etcd Add /mon/nsfrb/timing and /mon/nsfrb/packets to influx/grafana May 27, 2025
@rh-codebase
Collaborator

Pretty sure that data structure won't work with the current implementation. Each node will have to publish its monitor data under its own etcd key. In this case, like so:

/mon/nsfrbtiming/1
/mon/nsfrbtiming/2
....
/mon/nsfrbtiming/16

'0' is special, as it represents all nodes (for commands), so the index starts at 1. Similarly for:

/mon/nsfrbpackets/1
....
/mon/nsfrbpackets/16

Each of the data structures should include the name of the node. For antennas, this was 'ant_num'. For correlator nodes, you could use 'corr_num', as was done for /mon/corr.

As for dashboard creation, you are likely on your own, assuming everyone can create a dashboard.

@mbsherma
Author

@rh-codebase Sorry for the delay, and thanks for the clarification. I've revised the timing keys to match the format you sent:

  • /mon/nsfrbtiming/1...16: {
    corr_num: 0...15,
    image_time: float,
    tx_time: float,
    ISOT: str
    }

I made a separate key for the search time since that will only have one value, not a value per corr node:

  • /mon/nsfrbsearchtiming: {
    search_time: float
    }

Similarly, there's only one packets key for now. "dropped" is an integer between 0 and 16 because the process server reports how many corr nodes it failed to receive data from; the corr nodes don't need individual keys, though:

  • /mon/nsfrbpackets: {
    dropped: int
    }

I'll work on creating a dashboard, thanks!
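
For reference, here is a minimal sketch of how the per-node timing key and dictionary above could be built before publishing. The helper names `timing_key` and `timing_payload` are hypothetical, and the UTC timestamp with millisecond precision is an assumption based on the ISOT format seen elsewhere in this thread. The 0-based corr_num is mapped to a 1-based key index, since index 0 is reserved.

```python
from datetime import datetime, timezone

N_CORR_NODES = 16  # number of correlator nodes, per this thread


def timing_key(corr_num: int) -> str:
    """etcd key for one correlator node (hypothetical helper).

    Key indices start at 1 because 0 is reserved for "all nodes".
    """
    return f"/mon/nsfrbtiming/{corr_num + 1}"


def timing_payload(corr_num: int, image_time: float, tx_time: float) -> dict:
    """Build the per-node dictionary in the format listed above."""
    if not 0 <= corr_num < N_CORR_NODES:
        raise ValueError(f"corr_num must be in 0..{N_CORR_NODES - 1}")
    # Assumption: ISOT is a UTC timestamp truncated to milliseconds.
    isot = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3]
    return {
        "corr_num": corr_num,
        "image_time": float(image_time),
        "tx_time": float(tx_time),
        "ISOT": isot,
    }


# Publishing would then look roughly like this (needs dsautils and etcd access):
#   import dsautils.dsa_store as ds
#   ETCD = ds.DsaStore()
#   ETCD.put_dict(timing_key(0), timing_payload(0, 0.51, 1.38))
```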

@rh-codebase
Collaborator

The following influx tables should be created once data for the above keys is seen in etcd:

/mon/nsfrbtiming => nsfrbtiming (tag: corr_num)
/mon/nsfrbsearchtiming => nsfrbsearchtiming
/mon/nsfrbpackets => nsfrbpackets

In grafana, you want to construct queries against these tables (e.g. select tx_time from nsfrbtiming group by corr_num), or something like that. Give a shout if needed.
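
As a rough illustration of the grouping suggested above, the sketch below assembles a Grafana-style InfluxQL query string that returns one series per correlator node. The function name `per_node_query` is hypothetical, and `$timeFilter` and `$__interval` are Grafana macros, so the resulting string is meant for a Grafana panel rather than a direct influx client call.

```python
def per_node_query(field: str,
                   measurement: str = "nsfrbtiming",
                   tag: str = "corr_num") -> str:
    """Build an InfluxQL query that yields one series per tag value."""
    return (
        f'SELECT last("{field}") FROM "{measurement}" '
        f'WHERE $timeFilter GROUP BY time($__interval), "{tag}" fill(null)'
    )


print(per_node_query("tx_time"))
```

Grouping by the tag in addition to the time interval is what separates the nodes; without it, all nodes would be collapsed into a single series.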

@mbsherma
Author

Thanks for your help! I'm having some trouble with the query, e.g. for the nsfrbsearchtiming query I currently have this:

SELECT last("search_time") FROM "nsfrbsearchtiming" WHERE $timeFilter GROUP BY time($__interval) fill(null)

I confirmed that the field is populated in etcd:

>>> import dsautils.dsa_store as ds
>>> ETCD = ds.DsaStore()
>>> ETCD.get_dict("/mon/nsfrbsearchtiming")
{'search_time': 1.7225334644317627}

But I also see there's no corresponding measurement in influx:

>>> from influxdb import DataFrameClient
>>> influx = DataFrameClient('influxdbservice.pro.pvt', 8086, 'root', 'root', 'dsa110')
>>> influx.get_list_measurements()
[{'name': 'antmon'}, {'name': 'bebmon'}, {'name': 'calmon'}, {'name': 'corrmon'}, {'name': 'statusmon'}, {'name': 't2mon'}, {'name': 'wxmon'}]

Is there something I need to do after populating the etcd key with ETCD.put_dict to make it show up in influx?
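
A small check along these lines makes the gap explicit. This is only a sketch: `missing_measurements` is a hypothetical helper, the expected names are the three tables listed earlier in this thread, and the input is the list copied from the `get_list_measurements()` output above.

```python
EXPECTED = ("nsfrbtiming", "nsfrbsearchtiming", "nsfrbpackets")


def missing_measurements(measurements, expected=EXPECTED):
    """Return the expected measurement names absent from influx's list."""
    present = {m["name"] for m in measurements}
    return [name for name in expected if name not in present]


# Output of influx.get_list_measurements() as pasted above.
listed = [{'name': 'antmon'}, {'name': 'bebmon'}, {'name': 'calmon'},
          {'name': 'corrmon'}, {'name': 'statusmon'}, {'name': 't2mon'},
          {'name': 'wxmon'}]

print(missing_measurements(listed))
# → ['nsfrbtiming', 'nsfrbsearchtiming', 'nsfrbpackets']
```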

@mbsherma
Author

@rh-codebase Just following up on this, still no luck querying from grafana. I did some digging and maybe I'm not using the correct configuration file for ETCD? I believe the default file is this: etcdConfig.yml, is that still correct?

@mbsherma
Author

@rh-codebase Here are a few other examples; @caseyjlaw mentioned I should still have a number at the end of the nsfrbsearchtiming and nsfrbpackets keys, so I tried /mon/nsfrbsearchtiming/1 but got similar results:

>>> ETCD.put_dict("/mon/nsfrbsearchtiming/1",{'search_time': 1.8956832885742188,'search_num':1})
>>> from influxdb import DataFrameClient
>>> influx = DataFrameClient('influxdbservice.pro.pvt', 8086, 'root', 'root', 'dsa110')
>>> influx.get_list_measurements()
[{'name': 'antmon'}, {'name': 'bebmon'}, {'name': 'calmon'}, {'name': 'corrmon'}, {'name': 'statusmon'}, {'name': 't2mon'}, {'name': 'wxmon'}]
>>> x=influx.query("SELECT search_time FROM nsfrbsearchtiming")
>>> x
{}

And do the format and query for the nsfrbtiming key look correct? Here's the python query:

>>> ETCD.get_dict("/mon/nsfrbtiming/1")
{'corr_num': 0, 'ISOT': '2025-05-30T21:08:38.684', 'image_time': 0.5105586051940918, 'tx_time': 1.3844079971313477}

and the influx query in grafana:

SELECT last("image_time") AS "alias" FROM "nsfrbtiming" WHERE $timeFilter GROUP BY time($__interval) fill(null)
