Pose model output 39x5 #4752

Closed
charlieforward9 opened this issue Sep 1, 2023 · 12 comments
Labels
legacy:pose Pose Detection related issues platform:ios MediaPipe IOS issues type:support General questions

charlieforward9 commented Sep 1, 2023

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

OS Platform and Distribution

iOS

MediaPipe version

No response

Bazel version

No response

Solution

Pose

Programming Language and version

Dart Flutter

Describe the actual behavior

Inspired by this repo, and the Mediapipe TFLITE pose solution, we are set up with an output of 39x5 instead of the expected 33x5 indicated in this document. The aforementioned repo only displays the first 33 points on the UI, but I am curious to learn about what the last 6 values are at the end of each tensor. I could not find this information anywhere in the docs.

Describe the expected behaviour

No response

Standalone code/steps you may have used to try to get what you need

No response

Other info / Complete Logs

Example Output

134.13479614257812,32.75078201293945,-221.4569854736328,13.385955810546875,12.922725677490234
137.10333251953125,28.119789123535156,-216.42001342773438,12.21908950805664,11.891040802001953
138.84799194335938,27.963329315185547,-216.54434204101562,12.152606964111328,12.200130462646484
140.6144256591797,27.956462860107422,-216.52345275878906,12.557365417480469,12.056114196777344
131.8048095703125,28.35396957397461,-215.18161010742188,12.291149139404297,11.17852783203125
130.14910888671875,28.45285415649414,-215.30712890625,12.208850860595703,11.237407684326172
128.47409057617188,28.515979766845703,-215.3704071044922,12.258846282958984,10.900928497314453
143.97186279296875,29.890518188476562,-167.12005615234375,13.024253845214844,12.742427825927734
126.91309356689453,30.42294692993164,-160.8507080078125,11.137046813964844,10.699592590332031
137.84341430664062,37.661556243896484,-200.6927490234375,13.68783187866211,14.103363037109375
131.56275939941406,37.99659729003906,-199.65512084960938,12.965747833251953,12.52608871459961
158.85052490234375,60.239803314208984,-122.12614440917969,13.342771530151367,15.401704788208008
112.55911254882812,56.80223846435547,-100.87210083007812,10.702030181884766,11.333660125732422
167.490478515625,94.34288787841797,-111.26132202148438,7.27061653137207,12.09546184539795
76.6029281616211,76.36183166503906,-103.23043823242188,4.866189956665039,9.218698501586914
162.1429443359375,124.98287963867188,-161.14395141601562,7.234169006347656,12.186721801757812
40.001914978027344,71.06187438964844,-165.63743591308594,5.090389251708984,8.885896682739258
162.7422637939453,134.98391723632812,-177.08865356445312,6.0546722412109375,10.720813751220703
28.92837905883789,67.81619262695312,-181.84738159179688,4.344581604003906,7.898340225219727
158.093017578125,133.26971435546875,-191.50537109375,6.202449798583984,10.584671020507812
28.20886993408203,66.64647674560547,-197.94186401367188,4.552463531494141,7.78709602355957
156.73892211914062,129.43125915527344,-167.70697021484375,5.713146209716797,11.292427062988281
33.71411895751953,68.51156616210938,-174.68661499023438,4.44305419921875,8.636816024780273
141.4304962158203,121.50083923339844,0.1505880355834961,10.897388458251953,13.20727825164795
116.56924438476562,118.27957153320312,-0.23927783966064453,10.557445526123047,12.39103889465332
141.67721557617188,176.028076171875,-23.55157470703125,3.4209365844726562,10.327066421508789
118.32316589355469,167.36068725585938,-44.29826354980469,3.866872787475586,10.209527969360352
136.56163024902344,221.1106414794922,54.67216491699219,3.489116668701172,9.700339317321777
121.0483627319336,219.6265106201172,20.211471557617188,3.91546630859375,9.02173900604248
134.63148498535156,226.29901123046875,58.64472961425781,1.40594482421875,8.404857635498047
122.4677734375,226.418701171875,24.169479370117188,1.6010303497314453,8.481151580810547
133.93780517578125,239.70960998535156,-5.039459228515625,3.696338653564453,7.2691144943237305
120.84585571289062,237.5537872314453,-39.870635986328125,3.8709983825683594,6.582742691040039
128.98902893066406,119.67774963378906,-0.03412598371505737,-16.59803009033203,17.2281494140625
141.5630645751953,3.2175865173339844,-0.9088650345802307,-16.98242950439453,17.27218246459961
159.80050659179688,133.70938110351562,0.5654455423355103,-16.582294464111328,10.966882705688477
162.26556396484375,125.01246643066406,0.6279696822166443,-16.716691970825195,12.518455505371094
28.511804580688477,66.93032836914062,-0.043221503496170044,-16.528039932250977,8.99822998046875
40.070743560791016,71.04656982421875,0.9841947555541992,-16.71556282043457,9.626922607421875

@charlieforward9 charlieforward9 added the type:support General questions label Sep 1, 2023
@charlieforward9
Author

cc: @DrBubbles42

@kuaashish kuaashish assigned kuaashish and unassigned ayushgdev Sep 4, 2023
@kuaashish kuaashish added legacy:pose Pose Detection related issues platform:ios MediaPipe IOS issues type:support General questions and removed type:support General questions labels Sep 4, 2023
@kuaashish
Collaborator

@charlieforward9,

We have not yet officially introduced support for Flutter; it remains an item on our roadmap. You can follow the development progress in our tracking system, where it is categorised as "Work In Progress".

Regrettably, we cannot give a timeframe for when this support will become available. Please note that the repository you are following is maintained by a member of the community, so any issues with it are best raised in that repository.

Thank you

@kuaashish kuaashish added the stat:awaiting response Waiting for user response label Sep 4, 2023
@charlieforward9
Author

@kuaashish Thank you for the details. I have been following the development of the Dart buildout, and look forward to its release.

However, as I am using the TFLITE embedding, I figured it was platform agnostic and the 39x5 output was some additional data points that were a part of the model when it was "in its prime" 2 years ago.

I am hearing this is not the case, and that the Pose model output was always a 33x5 array.

Thank you.

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Waiting for user response label Sep 4, 2023
@schmidt-sebastian
Collaborator

@yichunk Do you know the answer to this?

@kuaashish kuaashish removed their assignment Sep 7, 2023
@kuaashish kuaashish added the stat:awaiting googler Waiting for Google Engineer's Response label Sep 7, 2023
LMR2018 commented Sep 11, 2023

Hello, do you know the answer to this question?

@kuaashish
Collaborator

@charlieforward9,

The pose world landmarks are in the output tensor at index 4.

To see where they come from, start from the Pose solution's top-level graph files and follow them to tensors_to_pose_landmarks_and_segmentation.pbtxt. In that file, look at the SplitTensorVectorCalculator, which splits the model output into several tensors, including world_landmarks_tensor.

These tensors provide the following information:

  • landmarks: 2D landmarks.
  • pose_flag: Indicates whether the pose is present in the image.
  • segmentation: Foreground/background segmentation mask.
  • heatmap: Heatmap for 33 landmarks to refine those predicted in the 2D output.
  • world_landmarks: 3D landmarks in metric space.

If you add pprint.pprint(output) to your code (here output presumably holds the TFLite interpreter's output details), you will see the following:

[{'dtype': <class 'numpy.float32'>,
  'index': 337,
  'name': 'Identity',
  'quantization': (0.0, 0),
  'quantization_parameters': {'quantized_dimension': 0,
                              'scales': array([], dtype=float32),
                              'zero_points': array([], dtype=int32)},
  'shape': array([  1, 195], dtype=int32),
  'shape_signature': array([  1, 195], dtype=int32),
  'sparsity_parameters': {}},
 {'dtype': <class 'numpy.float32'>,
  'index': 342,
  'name': 'Identity_1',
  'quantization': (0.0, 0),
  'quantization_parameters': {'quantized_dimension': 0,
                              'scales': array([], dtype=float32),
                              'zero_points': array([], dtype=int32)},
  'shape': array([1, 1], dtype=int32),
  'shape_signature': array([1, 1], dtype=int32),
  'sparsity_parameters': {}},
 {'dtype': <class 'numpy.float32'>,
  'index': 297,
  'name': 'Identity_2',
  'quantization': (0.0, 0),
  'quantization_parameters': {'quantized_dimension': 0,
                              'scales': array([], dtype=float32),
                              'zero_points': array([], dtype=int32)},
  'shape': array([  1, 256, 256,   1], dtype=int32),
  'shape_signature': array([  1, 256, 256,   1], dtype=int32),
  'sparsity_parameters': {}},
 {'dtype': <class 'numpy.float32'>,
  'index': 298,
  'name': 'Identity_3',
  'quantization': (0.0, 0),
  'quantization_parameters': {'quantized_dimension': 0,
                              'scales': array([], dtype=float32),
                              'zero_points': array([], dtype=int32)},
  'shape': array([ 1, 64, 64, 39], dtype=int32),
  'shape_signature': array([ 1, 64, 64, 39], dtype=int32),
  'sparsity_parameters': {}},
 {'dtype': <class 'numpy.float32'>,
  'index': 339,
  'name': 'Identity_4',
  'quantization': (0.0, 0),
  'quantization_parameters': {'quantized_dimension': 0,
                              'scales': array([], dtype=float32),
                              'zero_points': array([], dtype=int32)},
  'shape': array([  1, 117], dtype=int32),
  'shape_signature': array([  1, 117], dtype=int32),
  'sparsity_parameters': {}}]
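
As a rough cross-check of the listing above, the five outputs can be paired with the five tensor roles named earlier in this comment. This is only a sketch: the shapes are copied from the listing, but the pairing by position is an assumption (only index 4 is explicitly identified as world_landmarks in this thread), and `describe_outputs` is a hypothetical helper, not part of MediaPipe or TFLite.

```python
from math import prod

# Output names and shapes copied from the printed output details, in order.
OUTPUT_SHAPES = [
    ("Identity", (1, 195)),
    ("Identity_1", (1, 1)),
    ("Identity_2", (1, 256, 256, 1)),
    ("Identity_3", (1, 64, 64, 39)),
    ("Identity_4", (1, 117)),
]

# ASSUMPTION: roles are paired with outputs by position. Only the last one
# (index 4 = world_landmarks) is stated explicitly in the thread.
ROLES = ["landmarks", "pose_flag", "segmentation", "heatmap", "world_landmarks"]


def describe_outputs(shapes, roles):
    """Pair each output with its assumed role and count its elements."""
    described = []
    for index, ((name, shape), role) in enumerate(zip(shapes, roles)):
        described.append({
            "index": index,
            "name": name,
            "role": role,
            "shape": shape,
            "elements": prod(shape),  # total number of values in the tensor
        })
    return described


for entry in describe_outputs(OUTPUT_SHAPES, ROLES):
    print(entry["index"], entry["name"], entry["role"],
          entry["shape"], entry["elements"])
```

The element counts make the shapes easier to compare: 195 and 117 are both multiples of 39, and the heatmap's last dimension is also 39.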

The output details show that the tensor at index 4 of the output variable has shape [1, 117]. This is the "world_landmarks" tensor. It holds 39 keypoints: the 33 pose keypoints plus an additional 6 for debugging purposes. Each keypoint is stored as (x, y, z) coordinates, so 39 × 3 = 117 values, which accounts for the overall size of [1, 117]. The tensor at index 0, with shape [1, 195], is consistent with the same 39 keypoints at 5 values each (39 × 5 = 195), which matches the 39x5 output in the original question.
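
A minimal sketch of how the flat tensors could be split into per-keypoint rows, assuming the layout described above (39 keypoints, the first 33 being the pose keypoints). The helper `split_keypoints` is hypothetical, and the input lists are placeholder numbers standing in for real model output with the batch dimension already removed.

```python
NUM_KEYPOINTS = 39       # 33 pose keypoints + 6 extra, per the explanation above
NUM_POSE_KEYPOINTS = 33


def split_keypoints(flat, values_per_keypoint):
    """Reshape a flat sequence into rows, then separate pose keypoints from extras."""
    expected = NUM_KEYPOINTS * values_per_keypoint
    if len(flat) != expected:
        raise ValueError(f"expected {expected} values, got {len(flat)}")
    rows = [
        list(flat[i:i + values_per_keypoint])
        for i in range(0, len(flat), values_per_keypoint)
    ]
    return rows[:NUM_POSE_KEYPOINTS], rows[NUM_POSE_KEYPOINTS:]


# Placeholder data, NOT real model output.
landmarks_flat = [float(i) for i in range(195)]  # stands in for 'Identity', [1, 195]
world_flat = [float(i) for i in range(117)]      # stands in for 'Identity_4', [1, 117]

# 5 values per keypoint in the first tensor, 3 (x, y, z) in the world tensor.
pose_2d, extra_2d = split_keypoints(landmarks_flat, 5)
pose_world, extra_world = split_keypoints(world_flat, 3)

print(len(pose_2d), len(extra_2d), len(pose_2d[0]))           # 33 6 5
print(len(pose_world), len(extra_world), len(pose_world[0]))  # 33 6 3
```

Keeping the extra 6 rows separate rather than discarding them makes it easy to inspect them later without re-running the model.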

@kuaashish kuaashish assigned kuaashish and unassigned yichunk Sep 11, 2023
@kuaashish kuaashish added stat:awaiting response Waiting for user response and removed stat:awaiting googler Waiting for Google Engineer's Response labels Sep 11, 2023
LMR2018 commented Sep 11, 2023

@kuaashish How to convert the world coordinates to the pixel coordinates of the picture?

(1, 117)
[[ 1.48026943e-02 -5.98276973e-01 -2.04643250e-01 1.37524456e-02
-6.26582146e-01 -2.21675396e-01 1.39739662e-02 -6.27158165e-01
-2.21012354e-01 1.35345124e-02 -6.27387524e-01 -2.21095324e-01
1.09504461e-02 -6.22659683e-01 -2.13589668e-01 1.13124251e-02
-6.23600006e-01 -2.14592934e-01 1.04968250e-02 -6.24986410e-01
-2.13641644e-01 2.25660801e-02 -5.95743656e-01 -2.14721918e-01
1.25169754e-03 -5.86930275e-01 -1.92970276e-01 1.88541412e-02
-5.69226503e-01 -1.95325851e-01 1.21278763e-02 -5.65126359e-01
-1.88166142e-01 4.08923626e-02 -4.72871304e-01 -1.94905207e-01
2.03198195e-02 -4.79958057e-01 -1.21806860e-01 1.15839958e-01
-4.41007137e-01 -2.05171227e-01 1.06580257e-01 -4.70108032e-01
-7.59732127e-02 1.39840603e-01 -5.61654806e-01 -2.07411528e-01
1.38329029e-01 -5.72790623e-01 -5.27213812e-02 1.40963554e-01
-5.77191949e-01 -2.19899893e-01 1.60604954e-01 -5.83572388e-01
-5.44798374e-02 1.39861345e-01 -5.88037491e-01 -2.15969086e-01
1.40260696e-01 -5.96107483e-01 -6.51528835e-02 1.41887844e-01
-5.70112109e-01 -2.04580545e-01 1.36584759e-01 -5.81240654e-01
-5.71484566e-02 2.11625099e-02 7.05716014e-03 -3.72881889e-02
-2.12016106e-02 -7.82603025e-03 3.86805534e-02 -7.09004402e-02
1.30949020e-01 -5.47969341e-03 -1.91941261e-02 1.39912128e-01
1.18061200e-01 -7.74564743e-02 3.54447842e-01 1.36296272e-01
-3.96537781e-02 3.21997523e-01 2.74263859e-01 -7.40270615e-02
3.81817460e-01 1.49265289e-01 -4.69522476e-02 3.48016143e-01
2.89239407e-01 -1.17719650e-01 4.22744036e-01 1.35585785e-01
-1.08781815e-01 3.85118008e-01 2.84455776e-01 2.05600262e-03
3.22021544e-04 -9.41365957e-04 1.92906037e-02 -8.04975852e-02
-1.79793283e-01 6.64976165e-02 1.18397407e-01 -1.18872344e-01
1.15461573e-02 4.42665629e-02 1.30689263e-01 -1.26634508e-01
1.38391450e-01 1.00455478e-01 9.39180627e-02 1.22111320e-01
-4.83857542e-02]]

@kuaashish
Collaborator

@charlieforward9,

Could you kindly review the aforementioned comment and provide us with an update on whether the issue has been resolved from your end? May we proceed to mark this matter as resolved and close it? Thank you

@kuaashish kuaashish added stat:awaiting response Waiting for user response and removed stat:awaiting response Waiting for user response labels Sep 14, 2023
@charlieforward9
Author

Thank you very much for the response. I will leave it to @DrBubbles42 to confirm whether this resolves his concerns. I was asking this on his behalf.

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Waiting for user response label Sep 14, 2023
@DrBubbles42

@kuaashish Thank you for your response. This answers my question and we can close the issue.

@kuaashish
Collaborator

Thank you for your confirmation. We are now closing this issue as resolved.
